Compare commits


No commits in common. "v1.1.0" and "master" have entirely different histories.

72 changed files with 8185 additions and 8814 deletions


# connectd environment variables
# copy to .env and fill in your values
# === REQUIRED ===
GROQ_API_KEY=
GROQ_MODEL=llama-3.3-70b-versatile
# === DISTRIBUTED MODE (optional) ===
# for coordinating multiple connectd instances
CONNECTD_CENTRAL_API=
CONNECTD_API_KEY=
CONNECTD_INSTANCE_ID=
CONNECTD_INSTANCE_IP=
# === DISCOVERY: GITHUB ===
# works without token but heavily rate limited
GITHUB_TOKEN=
# === DISCOVERY: FEDIVERSE ===
MASTODON_TOKEN=
MASTODON_INSTANCE=
# lemmy (for authenticated access to your instance)
LEMMY_INSTANCE=
LEMMY_USERNAME=
LEMMY_PASSWORD=
# === DISCOVERY: OTHER ===
DISCORD_BOT_TOKEN=
DISCORD_TARGET_SERVERS=
# === DELIVERY: EMAIL ===
SMTP_HOST=
SMTP_PORT=465
SMTP_USER=
SMTP_PASS=
FROM_EMAIL=
# === DELIVERY: SOCIAL ===
# mastodon - reuses discovery token above
# MASTODON_TOKEN=
# MASTODON_INSTANCE=
BLUESKY_HANDLE=
BLUESKY_APP_PASSWORD=
MATRIX_HOMESERVER=
MATRIX_USER_ID=
MATRIX_ACCESS_TOKEN=
# === DELIVERY: FORGE ISSUES ===
# for creating issues on self-hosted git forges
# highest signal outreach - these people actually selfhost
# codeberg (largest public gitea instance)
CODEBERG_TOKEN=
# gitea/forgejo instances - format: GITEA_TOKEN_<host_with_underscores>=token
# examples:
# GITEA_TOKEN_git_example_com=your-token
# GITEA_TOKEN_192_168_1_8_3000=your-token
# gitlab CE instances - format: GITLAB_TOKEN_<host_with_underscores>=token
# examples:
# GITLAB_TOKEN_gitlab_example_com=your-token
# === HOST USER CONFIG ===
# you - gets priority matching and appears in intros
HOST_USER=
HOST_NAME=
HOST_EMAIL=
HOST_GITHUB=
HOST_MASTODON=
HOST_REDDIT=
HOST_LEMMY=
HOST_LOBSTERS=
HOST_MATRIX=
HOST_DISCORD=
HOST_BLUESKY=
HOST_LOCATION=
HOST_INTERESTS=
HOST_LOOKING_FOR=
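for reference, a file in the shape above can be loaded with a few lines of stdlib python. this is a minimal sketch only; connectd's own config.py may load it differently:

```python
import os

def load_env(path='.env'):
    """parse KEY=VALUE lines, skipping blanks and # comments;
    variables already set in the environment win over file values"""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#') or '=' not in line:
                continue
            key, _, value = line.partition('=')
            os.environ.setdefault(key.strip(), value.strip())
```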


## what it does
1. **scouts** - discovers humans across platforms (github, mastodon, lemmy, reddit, lobsters, bluesky, matrix, discord, and self-hosted git forges)
2. **analyzes** - scores them for values alignment AND lost builder potential
3. **matches** - pairs aligned builders together, or pairs lost builders with inspiring active ones
4. **drafts** - uses LLM to write genuine, personalized intros
5. **delivers** - sends via the channel they're most active on (email, mastodon, bluesky, matrix, discord, github issue, or forge issue)
fully autonomous. no manual review. self-sustaining pipeline.
people who have potential but haven't started yet, gave up, or are struggling:
lost builders don't get matched to each other (both need energy). they get matched to ACTIVE builders who can inspire them.
## discovery sources
| platform | method |
|----------|--------|
| github | API + profile scraping |
| mastodon | public API |
| lemmy | federation API |
| reddit | public API |
| lobsters | web scraping |
| bluesky | AT Protocol |
| matrix | room membership |
| discord | bot API |
| **gitea/forgejo** | instance API |
| **gitlab CE** | instance API |
| **gogs** | instance API |
| **sourcehut** | web scraping |
| **codeberg** | gitea API |
self-hosted git forge users = highest signal. they actually selfhost.
## delivery methods
connectd picks the best contact method based on **activity** - not a static priority list. if someone's most active on mastodon, they get a mastodon DM. if that fails, it falls back to their second-most-active platform.
| method | notes |
|--------|-------|
| email | extracted from profiles, commits, websites |
| mastodon DM | if they allow DMs |
| bluesky DM | via AT Protocol |
| matrix DM | creates DM room |
| discord DM | via bot |
| github issue | on their most active repo |
| **forge issue** | gitea/forgejo/gitlab/gogs repos |
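the activity-based pick can be sketched as ordering the channels you have contact info for by the recipient's observed activity. this is illustrative only, with made-up names; connectd's actual deliver.py may differ:

```python
def rank_delivery_channels(activity_counts, available_channels):
    """most-active channel first; delivery tries each in turn and
    falls back to the next on failure"""
    return sorted(available_channels,
                  key=lambda ch: activity_counts.get(ch, 0),
                  reverse=True)

# someone who mostly posts on mastodon gets a mastodon DM first,
# with email as the fallback
channels = rank_delivery_channels({'mastodon': 42, 'email': 5},
                                  ['email', 'mastodon', 'github'])
# -> ['mastodon', 'email', 'github']
```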
## quick start
```bash
python daemon.py                  # live mode
# discovery
python cli.py scout # all platforms
python cli.py scout --github # github only
python cli.py scout --forges # self-hosted git forges
python cli.py scout --user octocat # deep scrape one user
# matching
python cli.py daemon --oneshot    # run once then exit
python cli.py status # show stats
```
## distributed mode
multiple connectd instances can coordinate via a central API to:
- share discovered humans
- avoid duplicate outreach
- claim/release outreach targets
```bash
# set in .env
CONNECTD_CENTRAL_API=https://your-central-api.com
CONNECTD_API_KEY=your-api-key
CONNECTD_INSTANCE_ID=instance-name
CONNECTD_INSTANCE_IP=your-ip
```
## environment variables
copy `.env.example` to `.env` and fill in your values.
### required

| variable | description |
|----------|-------------|
| `GROQ_API_KEY` | groq API key for LLM drafting |
| `GROQ_MODEL` | groq model (default `llama-3.3-70b-versatile`) |

### discovery

| variable | description |
|----------|-------------|
| `GITHUB_TOKEN` | higher rate limits for github API |
| `DISCORD_BOT_TOKEN` | discord bot token for server access |
| `DISCORD_TARGET_SERVERS` | comma-separated server IDs to scout |
| `LEMMY_INSTANCE` | your lemmy instance |
| `LEMMY_USERNAME` | lemmy username for auth |
| `LEMMY_PASSWORD` | lemmy password for auth |
### delivery
| variable | description |
|----------|-------------|
| `MASTODON_TOKEN` | mastodon access token |
| `MASTODON_INSTANCE` | your mastodon instance |
| `BLUESKY_HANDLE` | bluesky handle |
| `BLUESKY_APP_PASSWORD` | bluesky app password |
| `MATRIX_HOMESERVER` | matrix homeserver URL |
| `MATRIX_USER_ID` | matrix user ID |
| `MATRIX_ACCESS_TOKEN` | matrix access token |
| `SMTP_HOST` | email server host |
| `SMTP_PORT` | email server port (default 465) |
| `SMTP_USER` | email username |
| `SMTP_PASS` | email password |
| `FROM_EMAIL` | from address for emails |
you need at least ONE delivery method configured for intros to be sent.
### forge tokens
for creating issues on self-hosted git forges:
| variable | description |
|----------|-------------|
| `CODEBERG_TOKEN` | codeberg.org access token |
| `GITEA_TOKEN_<instance>` | gitea/forgejo token (e.g. `GITEA_TOKEN_git_example_com`) |
| `GITLAB_TOKEN_<instance>` | gitlab CE token (e.g. `GITLAB_TOKEN_gitlab_example_com`) |
instance names use underscores: `git.example.com` → `GITEA_TOKEN_git_example_com`
for ports: `192.168.1.8:3000` → `GITEA_TOKEN_192_168_1_8_3000`
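the host-to-variable-name mapping is mechanical enough to sketch. the helper name here is hypothetical and only illustrates the convention above:

```python
def forge_token_var(forge, host):
    """GITEA_TOKEN_git_example_com style names: dots and colons
    in the instance host become underscores"""
    return f"{forge}_TOKEN_" + host.replace('.', '_').replace(':', '_')

forge_token_var('GITEA', 'git.example.com')      # -> 'GITEA_TOKEN_git_example_com'
forge_token_var('GITEA', '192.168.1.8:3000')     # -> 'GITEA_TOKEN_192_168_1_8_3000'
forge_token_var('GITLAB', 'gitlab.example.com')  # -> 'GITLAB_TOKEN_gitlab_example_com'
```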
## architecture
```
scoutd/ - discovery modules (one per platform)
forges.py - gitea/forgejo/gitlab/gogs/sourcehut scraper
handles.py - cross-platform handle discovery
matchd/ - matching + fingerprinting logic
introd/ - intro drafting + delivery
deliver.py - multi-channel delivery with fallback
groq_draft.py - LLM-powered intro generation
db/ - sqlite storage
central_client.py - distributed coordination
config.py - central configuration
daemon.py - continuous runner
cli.py - command line interface
```

## schedule
- scout: every 4 hours
- match: every 1 hour
- intros: every 2 hours (max 1000/day)
- lost builder intros: every 6 hours (max 100/day)
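the intervals and daily caps above amount to a simple gate per task. a sketch, not daemon.py's actual loop:

```python
import time

def should_run(last_run_ts, interval_hours, sent_today=0, daily_cap=None, now=None):
    """run when the interval has elapsed and the daily cap isn't hit"""
    now = time.time() if now is None else now
    if daily_cap is not None and sent_today >= daily_cap:
        return False
    return last_run_ts is None or now - last_run_ts >= interval_hours * 3600

should_run(None, 2)                                            # first intro pass -> True
should_run(0, 2, sent_today=1000, daily_cap=1000, now=10**9)   # capped -> False
```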
## forking

(binary image added, 130 KiB, not shown)

api.py (new file, 976 lines)

#!/usr/bin/env python3
"""
connectd/api.py - REST API for stats and control
exposes daemon stats for home assistant integration.
runs on port 8099 by default.
"""
import os
import json
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler
from datetime import datetime
from db import Database
from db.users import get_priority_users, get_priority_user_matches, get_priority_user
API_PORT = int(os.environ.get('CONNECTD_API_PORT', 8099))

# shared state (updated by daemon)
_daemon_state = {
    'running': False,
    'dry_run': False,
    'last_scout': None,
    'last_match': None,
    'last_intro': None,
    'last_lost': None,
    'intros_today': 0,
    'lost_intros_today': 0,
    'started_at': None,
}


def update_daemon_state(state_dict):
    """update shared daemon state (called by daemon)"""
    _daemon_state.update(state_dict)


def get_daemon_state():
    """get a copy of the current daemon state"""
    return _daemon_state.copy()
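# usage sketch (hypothetical): the daemon pushes state here on each cycle and
# the handlers below read it back, e.g.
#   update_daemon_state({'running': True, 'started_at': datetime.now().isoformat()})
#   get_daemon_state()['running']  # True
# a plain dict is acceptable here because the daemon is the only writer and
# each update is a single call under the GIL; no lock is taken.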
DASHBOARD_HTML = """<!DOCTYPE html>
<html>
<head>
<title>connectd</title>
<meta charset=utf-8>
<link rel="icon" type="image/png" href="/favicon.png">
<style>
*{box-sizing:border-box;margin:0;padding:0}
body{font-family:monospace;background:#0a0a0f;color:#0f8;padding:20px}
h1{color:#c792ea;margin-bottom:15px}
h2{color:#82aaff;margin:15px 0 10px}
.stats{display:flex;gap:12px;flex-wrap:wrap;margin-bottom:15px}
.stat{background:#1a1a2e;padding:10px 16px;border-radius:6px;border:1px solid #333;text-align:center}
.stat b{font-size:1.6em;color:#c792ea;display:block}
.stat small{color:#666;font-size:.75em}
.card{background:#1a1a2e;border:1px solid #333;border-radius:6px;padding:10px;margin-bottom:8px;cursor:pointer}
.card:hover{border-color:#0f8}
.card-hdr{display:flex;justify-content:space-between;color:#82aaff}
.score{background:#2a2a4e;padding:2px 8px;border-radius:4px;color:#c792ea}
.body{background:#0d0d15;padding:10px;border-radius:4px;white-space:pre-wrap;color:#ddd;margin-top:8px;font-size:.85em}
.meta{color:#666;font-size:.75em;margin-top:5px}
.m{display:inline-block;padding:1px 5px;border-radius:3px;font-size:.75em}
.m-email{background:#2d4a2d;color:#8f8}
.m-mastodon{background:#3d3a5c;color:#c792ea}
.m-new{background:#2d3a4a;color:#82aaff}
.tabs{margin-bottom:12px}
.tab{background:#1a1a2e;border:1px solid #333;color:#0f8;padding:6px 14px;cursor:pointer;font-family:monospace;font-size:.9em}
.tab.on{background:#2a2a4e;border-color:#0f8}
.pnl{display:none}
.pnl.on{display:block}
.btn{background:#0f8;color:#0a0a0f;border:none;padding:6px 14px;cursor:pointer;font-family:monospace;font-weight:bold;margin-left:10px;font-size:.9em}
.err{color:#f66}
a{color:#82aaff}
.status{font-size:.85em;color:#888;margin-bottom:10px}
.status b{color:#0f8}
.cached{color:#555;font-size:.7em}
.to{color:#f7c}
.about{color:#82aaff}
</style>
</head>
<body>
<h1>connectd <a href="https://github.com/sudoxnym/connectd" style="font-size:.5em;color:#82aaff">repo</a> <a href="https://github.com/connectd-daemon" style="font-size:.5em;color:#f7c">org</a></h1>
<div class="status" id="status"></div>
<div class="stats" id="stats"></div>
<div class="tabs">
<button class="tab on" onclick="show('host')">you</button>
<button class="tab" onclick="show('queue')">queue</button>
<button class="tab" onclick="show('sent')">sent</button>
<button class="tab" onclick="show('failed')">failed</button>
<button class="btn" onclick="load()">refresh</button>
</div>
<div id="host" class="pnl on"></div>
<div id="queue" class="pnl"></div>
<div id="sent" class="pnl"></div>
<div id="failed" class="pnl"></div>
<script>
async function loadStats(){
var sr=await fetch('/api/stats'),hr=await fetch('/api/host');
var s=await sr.json(),h=await hr.json();
var up=h.uptime_seconds?Math.floor(h.uptime_seconds/3600)+'h '+Math.floor((h.uptime_seconds%3600)/60)+'m':'0m';
document.getElementById('status').innerHTML='daemon <b>'+(h.running?'ON':'OFF')+'</b> | '+up+' | '+h.intros_today+' today';
document.getElementById('stats').innerHTML='<div class="stat"><b>'+s.total_humans+'</b><small>humans</small></div><div class="stat"><b>'+s.total_matches+'</b><small>matches</small></div><div class="stat"><b>'+h.score_90_plus+'</b><small>90+</small></div><div class="stat"><b>'+h.score_80_89+'</b><small>80+</small></div><div class="stat"><b>'+h.matches_pending+'</b><small>queue</small></div><div class="stat"><b>'+s.sent_intros+'</b><small>sent</small></div>';
}
async function loadHost(){
var r=await fetch('/api/host_matches?limit=20'),d=await r.json();
var c='<h2>your matches ('+d.host+')</h2>';
c+='<p style="color:#666;font-size:.8em;margin-bottom:10px">each match = 2 intros (one to you, one to them)</p>';
if(!d.matches||!d.matches.length){c+='<div class="meta">no matches yet</div>';}
for(var i=0;i<(d.matches||[]).length;i++){
var m=d.matches[i];
c+='<div class="card" onclick="prevHost('+m.id+',1,this)"><div class="card-hdr"><span class="to">TO: you</span><span class="score">'+m.score+'</span></div><div class="meta"><span class="about">ABOUT: '+m.other_user+'</span> ('+m.other_platform+')</div><div class="meta">'+(m.reasons||[]).slice(0,2).join(', ')+'</div><div id="h'+m.id+'a" class="body" style="display:none"></div></div>';
c+='<div class="card" onclick="prevHost('+m.id+',2,this)"><div class="card-hdr"><span class="to">TO: '+m.other_user+'</span><span class="score">'+m.score+'</span></div><div class="meta"><span class="about">ABOUT: you</span></div><div class="meta">'+(m.contact||'no contact')+'</div><div id="h'+m.id+'b" class="body" style="display:none"></div></div>';
}
document.getElementById('host').innerHTML=c;
}
async function prevHost(id,dir,card){
var el=document.getElementById('h'+id+(dir==1?'a':'b'));
if(el.style.display!='none'){el.style.display='none';return;}
el.innerHTML='loading...';el.style.display='block';
var r=await fetch('/api/preview_host_draft?id='+id+'&dir='+(dir==1?'to_you':'to_them'));
var d=await r.json();
if(d.error){el.innerHTML='<span class="err">'+d.error+'</span>';}
else{el.innerHTML='<b>SUBJ:</b> '+d.subject+(d.cached?' <span class="cached">(cached)</span>':'')+'<br><br>'+d.draft;}
}
async function loadQueue(){
var r=await fetch('/api/pending_matches?limit=40'),d=await r.json();
var c='<h2>outreach queue</h2>';
if(!d.matches||!d.matches.length){c+='<div class="meta">empty</div>';}
for(var i=0;i<(d.matches||[]).length;i++){
var p=d.matches[i];
c+='<div class="card" onclick="prevQ('+p.id+',this)"><div class="card-hdr"><span class="to">TO: '+p.to_user+'</span><span class="score">'+p.score+'</span></div><div class="meta"><span class="about">ABOUT: '+p.about_user+'</span> | <span class="m m-'+(p.method||'new')+'">'+(p.method||'?')+'</span> '+(p.contact||'')+'</div><div id="q'+p.id+'_'+i+'" class="body" style="display:none"></div></div>';
}
document.getElementById('queue').innerHTML=c;
}
async function prevQ(id,card){
var el=card.querySelector('.body');
if(el.style.display!='none'){el.style.display='none';return;}
el.innerHTML='loading...';el.style.display='block';
var r=await fetch('/api/preview_draft?id='+id);
var d=await r.json();
if(d.error){el.innerHTML='<span class="err">'+d.error+'</span>';}
else{el.innerHTML='<b>TO:</b> '+d.to+'\n<b>ABOUT:</b> '+d.about+'\n<b>SUBJ:</b> '+d.subject+(d.cached?' <span class="cached">(cached)</span>':'')+'<br><br>'+d.draft;}
}
async function loadSent(){var r=await fetch('/api/sent_intros'),d=await r.json();var c='<h2>sent</h2>';for(var i=0;i<(d.sent||[]).length;i++){var s=d.sent[i];c+='<div class="card"><div class="card-hdr">TO: '+s.recipient_id+' <span class="m m-'+s.method+'">'+s.method+'</span></div><div class="body">'+(s.draft||'-')+'</div><div class="meta">'+s.timestamp+'</div></div>';}document.getElementById('sent').innerHTML=c;}
async function loadFailed(){var r=await fetch('/api/failed_intros'),d=await r.json();var c='<h2>failed</h2>';for(var i=0;i<(d.failed||[]).length;i++){var f=d.failed[i];c+='<div class="card"><div class="card-hdr">'+f.recipient_id+'</div><div class="meta err">'+f.error+'</div></div>';}document.getElementById('failed').innerHTML=c;}
function show(n){document.querySelectorAll('.pnl').forEach(function(e){e.classList.remove('on')});document.querySelectorAll('.tab').forEach(function(e){e.classList.remove('on')});document.getElementById(n).classList.add('on');event.target.classList.add('on');}
function load(){loadStats();loadHost();loadQueue();loadSent();loadFailed();}
load();setInterval(load,60000);
</script>
</body>
</html>"""
# draft cache - stores generated drafts so they don't regenerate on every request
_draft_cache = {}


def get_cached_draft(match_id, match_type='match'):
    key = f"{match_type}:{match_id}"
    return _draft_cache.get(key)


def cache_draft(match_id, draft_data, match_type='match'):
    key = f"{match_type}:{match_id}"
    _draft_cache[key] = draft_data
class APIHandler(BaseHTTPRequestHandler):
    """simple REST API handler"""

    def log_message(self, format, *args):
        """suppress default logging"""
        pass

    def _send_json(self, data, status=200):
        """send JSON response"""
        self.send_response(status)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Access-Control-Allow-Origin', '*')
        self.end_headers()
        self.wfile.write(json.dumps(data).encode())

    def do_GET(self):
        """handle GET requests"""
        path = self.path.split('?')[0]
        if path == '/favicon.png' or path == '/favicon.ico':
            self._handle_favicon()
        elif path == '/' or path == '/dashboard':
            self._handle_dashboard()
        elif path == '/api/stats':
            self._handle_stats()
        elif path == '/api/host':
            self._handle_host()
        elif path == '/api/host_matches':
            self._handle_host_matches()
        elif path == '/api/your_matches':
            self._handle_your_matches()
        elif path == '/api/preview_match_draft':
            self._handle_preview_match_draft()
        elif path == '/api/preview_host_draft':
            self._handle_preview_host_draft()
        elif path == '/api/preview_draft':
            self._handle_preview_draft()
        elif path == '/api/pending_about_you':
            self._handle_pending_about_you()
        elif path == '/api/pending_to_you':
            self._handle_pending_to_you()
        elif path == '/api/pending_matches':
            self._handle_pending_matches()
        elif path == '/api/sent_intros':
            self._handle_sent_intros()
        elif path == '/api/failed_intros':
            self._handle_failed_intros()
        elif path == '/api/health':
            self._handle_health()
        elif path == '/api/state':
            self._handle_state()
        elif path == '/api/priority_matches':
            self._handle_priority_matches()
        elif path == '/api/top_humans':
            self._handle_top_humans()
        elif path == '/api/user':
            self._handle_user()
        else:
            self._send_json({'error': 'not found'}, 404)
    def _handle_favicon(self):
        from pathlib import Path
        fav = Path('/app/data/favicon.png')
        if fav.exists():
            self.send_response(200)
            self.send_header('Content-Type', 'image/png')
            self.end_headers()
            self.wfile.write(fav.read_bytes())
        else:
            self.send_response(404)
            self.end_headers()

    def _handle_dashboard(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(DASHBOARD_HTML.encode())

    def _handle_sent_intros(self):
        from pathlib import Path
        log_path = Path("/app/data/delivery_log.json")
        sent = []
        if log_path.exists():
            with open(log_path) as f:
                log = json.load(f)
            sent = log.get("sent", [])[-20:]
            sent.reverse()
        self._send_json({"sent": sent})

    def _handle_failed_intros(self):
        from pathlib import Path
        log_path = Path("/app/data/delivery_log.json")
        failed = []
        if log_path.exists():
            with open(log_path) as f:
                log = json.load(f)
            failed = log.get("failed", [])
        self._send_json({"failed": failed})
    def _handle_host(self):
        """daemon status and match stats"""
        import sqlite3
        state = get_daemon_state()
        try:
            conn = sqlite3.connect('/data/db/connectd.db')
            c = conn.cursor()
            c.execute("SELECT COUNT(*) FROM matches WHERE status='pending' AND overlap_score >= 60")
            pending = c.fetchone()[0]
            c.execute("SELECT COUNT(*) FROM matches WHERE status='intro_sent'")
            sent = c.fetchone()[0]
            c.execute("SELECT COUNT(*) FROM matches WHERE status='rejected'")
            rejected = c.fetchone()[0]
            c.execute("SELECT COUNT(*) FROM matches")
            total = c.fetchone()[0]
            c.execute("SELECT COUNT(*) FROM matches WHERE overlap_score >= 90")
            s90 = c.fetchone()[0]
            c.execute("SELECT COUNT(*) FROM matches WHERE overlap_score >= 80 AND overlap_score < 90")
            s80 = c.fetchone()[0]
            c.execute("SELECT COUNT(*) FROM matches WHERE overlap_score >= 70 AND overlap_score < 80")
            s70 = c.fetchone()[0]
            c.execute("SELECT COUNT(*) FROM matches WHERE overlap_score >= 60 AND overlap_score < 70")
            s60 = c.fetchone()[0]
            conn.close()
        except Exception:
            pending = sent = rejected = total = s90 = s80 = s70 = s60 = 0
        uptime = None
        if state.get('started_at'):
            try:
                start = datetime.fromisoformat(state['started_at']) if isinstance(state['started_at'], str) else state['started_at']
                uptime = int((datetime.now() - start).total_seconds())
            except (ValueError, TypeError):
                pass
        self._send_json({
            'running': state.get('running', False), 'dry_run': state.get('dry_run', False),
            'uptime_seconds': uptime, 'intros_today': state.get('intros_today', 0),
            'matches_pending': pending, 'matches_sent': sent, 'matches_rejected': rejected, 'matches_total': total,
            'score_90_plus': s90, 'score_80_89': s80, 'score_70_79': s70, 'score_60_69': s60,
        })
    def _handle_your_matches(self):
        """matches involving the host - shows both directions"""
        import sqlite3
        import json as j
        from db.users import get_priority_users
        limit = 15
        if '?' in self.path:
            for p in self.path.split('?')[1].split('&'):
                if p.startswith('limit='):
                    try:
                        limit = int(p.split('=')[1])
                    except ValueError:
                        pass
        try:
            db = Database()
            users = get_priority_users(db.conn)
            if not users:
                self._send_json({'matches': [], 'host': None})
                db.close()
                return
            host = users[0]
            host_name = host.get('github') or host.get('name')
            conn = sqlite3.connect('/data/db/connectd.db')
            c = conn.cursor()
            c.execute("""SELECT m.id, m.overlap_score, m.overlap_reasons, m.status,
                                h1.username, h1.platform, h1.contact,
                                h2.username, h2.platform, h2.contact
                         FROM matches m
                         JOIN humans h1 ON m.human_a_id = h1.id
                         JOIN humans h2 ON m.human_b_id = h2.id
                         WHERE (h1.username = ? OR h2.username = ?)
                           AND m.status = 'pending' AND m.overlap_score >= 60
                         ORDER BY m.overlap_score DESC LIMIT ?""", (host_name, host_name, limit))
            matches = []
            for row in c.fetchall():
                if row[4] == host_name:
                    other_user, other_platform = row[7], row[8]
                    other_contact = j.loads(row[9]) if row[9] else {}
                else:
                    other_user, other_platform = row[4], row[5]
                    other_contact = j.loads(row[6]) if row[6] else {}
                reasons = j.loads(row[2]) if row[2] else []
                matches.append({
                    'id': row[0], 'score': int(row[1]), 'reasons': reasons,
                    'status': row[3], 'other_user': other_user, 'other_platform': other_platform,
                    'contact': other_contact.get('email') or other_contact.get('mastodon') or ''
                })
            conn.close()
            db.close()
            self._send_json({'host': host_name, 'matches': matches})
        except Exception as e:
            self._send_json({'error': str(e)}, 500)
    def _handle_preview_match_draft(self):
        """preview draft for a match - dir=to_you or to_them"""
        import sqlite3
        import json as j
        from introd.groq_draft import draft_intro_with_llm
        from db.users import get_priority_users
        match_id = None
        direction = 'to_you'
        if '?' in self.path:
            for p in self.path.split('?')[1].split('&'):
                if p.startswith('id='):
                    try:
                        match_id = int(p.split('=')[1])
                    except ValueError:
                        pass
                if p.startswith('dir='):
                    direction = p.split('=')[1]
        if not match_id:
            self._send_json({'error': 'need ?id=match_id'}, 400)
            return
        cache_key = f"{match_id}_{direction}"
        cached = get_cached_draft(cache_key, 'match')
        if cached:
            cached['cached'] = True
            self._send_json(cached)
            return
        try:
            db = Database()
            users = get_priority_users(db.conn)
            if not users:
                self._send_json({'error': 'no priority user'}, 404)
                db.close()
                return
            host = users[0]
            host_name = host.get('github') or host.get('name')
            conn = sqlite3.connect('/data/db/connectd.db')
            c = conn.cursor()
            c.execute("""SELECT h1.username, h1.platform, h1.contact, h1.extra,
                                h2.username, h2.platform, h2.contact, h2.extra,
                                m.overlap_score, m.overlap_reasons
                         FROM matches m
                         JOIN humans h1 ON m.human_a_id = h1.id
                         JOIN humans h2 ON m.human_b_id = h2.id
                         WHERE m.id = ?""", (match_id,))
            row = c.fetchone()
            conn.close()
            db.close()
            if not row:
                self._send_json({'error': 'match not found'}, 404)
                return
            human_a = {'username': row[0], 'platform': row[1],
                       'contact': j.loads(row[2]) if row[2] else {},
                       'extra': j.loads(row[3]) if row[3] else {}}
            human_b = {'username': row[4], 'platform': row[5],
                       'contact': j.loads(row[6]) if row[6] else {},
                       'extra': j.loads(row[7]) if row[7] else {}}
            reasons = j.loads(row[9]) if row[9] else []
            if human_a['username'] == host_name:
                host_human, other_human = human_a, human_b
            else:
                host_human, other_human = human_b, human_a
            if direction == 'to_you':
                match_data = {'human_a': host_human, 'human_b': other_human,
                              'overlap_score': row[8], 'overlap_reasons': reasons}
                recipient_name = host_name
                about_name = other_human['username']
            else:
                match_data = {'human_a': other_human, 'human_b': host_human,
                              'overlap_score': row[8], 'overlap_reasons': reasons}
                recipient_name = other_human['username']
                about_name = host_name
            result, error = draft_intro_with_llm(match_data, recipient='a', dry_run=True)
            if error:
                self._send_json({'error': error}, 500)
                return
            response = {
                'match_id': match_id,
                'direction': direction,
                'to': recipient_name,
                'about': about_name,
                'subject': result.get('subject'),
                'draft': result.get('draft'),
                'score': row[8],
                'cached': False,
            }
            cache_draft(cache_key, response, 'match')
            self._send_json(response)
        except Exception as e:
            self._send_json({'error': str(e)}, 500)
    def _handle_host_matches(self):
        """matches for the priority user"""
        import sqlite3
        import json as j
        from db.users import get_priority_users
        limit = 20
        if '?' in self.path:
            for p in self.path.split('?')[1].split('&'):
                if p.startswith('limit='):
                    try:
                        limit = int(p.split('=')[1])
                    except ValueError:
                        pass
        try:
            db = Database()
            users = get_priority_users(db.conn)
            if not users:
                self._send_json({'matches': [], 'host': None})
                db.close()
                return
            host = users[0]
            conn = sqlite3.connect('/data/db/connectd.db')
            c = conn.cursor()
            c.execute("""SELECT pm.id, pm.overlap_score, pm.overlap_reasons, pm.status, h.username, h.platform, h.contact
                         FROM priority_matches pm JOIN humans h ON pm.matched_human_id = h.id
                         WHERE pm.priority_user_id = ? ORDER BY pm.overlap_score DESC LIMIT ?""", (host['id'], limit))
            matches = []
            for row in c.fetchall():
                reasons = j.loads(row[2]) if row[2] else []
                contact = j.loads(row[6]) if row[6] else {}
                matches.append({'id': row[0], 'score': int(row[1]), 'reasons': reasons, 'status': row[3],
                                'other_user': row[4], 'other_platform': row[5],
                                'contact': contact.get('email') or contact.get('mastodon') or contact.get('github') or ''})
            conn.close()
            db.close()
            self._send_json({'host': host.get('github') or host.get('name'), 'matches': matches})
        except Exception as e:
            self._send_json({'error': str(e)}, 500)
    def _handle_preview_host_draft(self):
        """preview draft for a priority match - dir=to_you or to_them"""
        import sqlite3
        import json as j
        from introd.groq_draft import draft_intro_with_llm
        from db.users import get_priority_users
        match_id = None
        direction = 'to_you'
        if '?' in self.path:
            for p in self.path.split('?')[1].split('&'):
                if p.startswith('id='):
                    try:
                        match_id = int(p.split('=')[1])
                    except ValueError:
                        pass
                if p.startswith('dir='):
                    direction = p.split('=')[1]
        if not match_id:
            self._send_json({'error': 'need ?id=match_id'}, 400)
            return
        cache_key = f"host_{match_id}_{direction}"
        cached = get_cached_draft(cache_key, 'host')
        if cached:
            cached['cached'] = True
            self._send_json(cached)
            return
        try:
            db = Database()
            users = get_priority_users(db.conn)
            if not users:
                self._send_json({'error': 'no priority user'}, 404)
                db.close()
                return
            host = users[0]
            conn = sqlite3.connect('/data/db/connectd.db')
            c = conn.cursor()
            # get the matched human from priority_matches
            c.execute("""SELECT h.username, h.platform, h.contact, h.extra, pm.overlap_score, pm.overlap_reasons
                         FROM priority_matches pm
                         JOIN humans h ON pm.matched_human_id = h.id
                         WHERE pm.id = ?""", (match_id,))
            row = c.fetchone()
            conn.close()
            db.close()
            if not row:
                self._send_json({'error': 'match not found'}, 404)
                return
            # the matched person (who we found for the host)
            other = {'username': row[0], 'platform': row[1],
                     'contact': j.loads(row[2]) if row[2] else {},
                     'extra': j.loads(row[3]) if row[3] else {}}
            # build the host profile as a human record
            host_human = {'username': host.get('github') or host.get('name'),
                          'platform': 'priority',
                          'contact': {'email': host.get('email'), 'mastodon': host.get('mastodon'), 'github': host.get('github')},
                          'extra': {'bio': host.get('bio'), 'interests': host.get('interests')}}
            reasons = j.loads(row[5]) if row[5] else []
            # direction determines who gets the intro
            if direction == 'to_you':
                # intro TO host ABOUT other
                match_data = {'human_a': host_human, 'human_b': other,
                              'overlap_score': row[4], 'overlap_reasons': reasons}
                to_name = host.get('github') or host.get('name')
                about_name = other['username']
            else:
                # intro TO other ABOUT host
                match_data = {'human_a': other, 'human_b': host_human,
                              'overlap_score': row[4], 'overlap_reasons': reasons}
                to_name = other['username']
                about_name = host.get('github') or host.get('name')
            result, error = draft_intro_with_llm(match_data, recipient='a', dry_run=True)
            if error:
                self._send_json({'error': error}, 500)
                return
            response = {
                'match_id': match_id,
                'direction': direction,
                'to': to_name,
                'about': about_name,
                'subject': result.get('subject'),
                'draft': result.get('draft'),
                'score': row[4],
                'cached': False,
            }
            cache_draft(cache_key, response, 'host')
            self._send_json(response)
        except Exception as e:
            self._send_json({'error': str(e)}, 500)
    def _handle_preview_draft(self):
        import sqlite3
        import json as j
        from introd.groq_draft import draft_intro_with_llm
        match_id = None
        if '?' in self.path:
            for p in self.path.split('?')[1].split('&'):
                if p.startswith('id='):
                    try:
                        match_id = int(p.split('=')[1])
                    except ValueError:
                        pass
        if not match_id:
            self._send_json({'error': 'need ?id=match_id'}, 400)
            return
        # check cache first
        cached = get_cached_draft(match_id, 'queue')
        if cached:
            cached['cached'] = True
            self._send_json(cached)
            return
        try:
            conn = sqlite3.connect('/data/db/connectd.db')
            c = conn.cursor()
            c.execute("""SELECT h1.username, h1.platform, h1.contact, h1.extra,
                                h2.username, h2.platform, h2.contact, h2.extra,
                                m.overlap_score, m.overlap_reasons
                         FROM matches m
                         JOIN humans h1 ON m.human_a_id = h1.id
                         JOIN humans h2 ON m.human_b_id = h2.id
                         WHERE m.id = ?""", (match_id,))
            row = c.fetchone()
            conn.close()
            if not row:
                self._send_json({'error': 'match not found'}, 404)
                return
            human_a = {'username': row[0], 'platform': row[1],
                       'contact': j.loads(row[2]) if row[2] else {},
                       'extra': j.loads(row[3]) if row[3] else {}}
            human_b = {'username': row[4], 'platform': row[5],
                       'contact': j.loads(row[6]) if row[6] else {},
                       'extra': j.loads(row[7]) if row[7] else {}}
            reasons = j.loads(row[9]) if row[9] else []
            match_data = {'human_a': human_a, 'human_b': human_b,
                          'overlap_score': row[8], 'overlap_reasons': reasons}
            result, error = draft_intro_with_llm(match_data, recipient='a', dry_run=True)
            if error:
                self._send_json({'error': error}, 500)
                return
            response = {
                'match_id': match_id,
                'to': human_a['username'],
                'about': human_b['username'],
                'subject': result.get('subject'),
                'draft': result.get('draft'),
                'score': row[8],
                'cached': False,
            }
            cache_draft(match_id, response, 'queue')
            self._send_json(response)
        except Exception as e:
            self._send_json({'error': str(e)}, 500)
def _handle_pending_about_you(self):
"""pending intros where host is human_b (being introduced to others)"""
import sqlite3
import json as j
from db.users import get_priority_users
limit = 10
if '?' in self.path:
for p in self.path.split('?')[1].split('&'):
if p.startswith('limit='):
try: limit = int(p.split('=')[1])
except ValueError: pass
try:
db = Database()
users = get_priority_users(db.conn)
if not users:
self._send_json({'matches': []})
db.close()
return
host = users[0]
host_name = host.get('github') or host.get('name')
conn = sqlite3.connect('/data/db/connectd.db')
c = conn.cursor()
c.execute("""SELECT m.id, h1.username, h1.platform, h1.contact,
m.overlap_score, m.overlap_reasons
FROM matches m
JOIN humans h1 ON m.human_a_id = h1.id
JOIN humans h2 ON m.human_b_id = h2.id
WHERE h2.username = ? AND m.status = 'pending' AND m.overlap_score >= 60
ORDER BY m.overlap_score DESC LIMIT ?""", (host_name, limit))
matches = []
for row in c.fetchall():
contact = j.loads(row[3]) if row[3] else {}
reasons = j.loads(row[5]) if row[5] else []
method = 'email' if contact.get('email') else ('mastodon' if contact.get('mastodon') else None)
matches.append({'id': row[0], 'to_user': row[1], 'to_platform': row[2],
'score': int(row[4]), 'reasons': reasons[:3], 'method': method,
'contact': contact.get('email') or contact.get('mastodon') or ''})
conn.close()
db.close()
self._send_json({'matches': matches})
except Exception as e:
self._send_json({'error': str(e)}, 500)
def _handle_pending_to_you(self):
"""pending intros where host is human_a (receiving intro about others)"""
import sqlite3
import json as j
from db.users import get_priority_users
limit = 20
if '?' in self.path:
for p in self.path.split('?')[1].split('&'):
if p.startswith('limit='):
try: limit = int(p.split('=')[1])
except ValueError: pass
try:
db = Database()
users = get_priority_users(db.conn)
if not users:
self._send_json({'matches': []})
db.close()
return
host = users[0]
conn = sqlite3.connect('/data/db/connectd.db')
c = conn.cursor()
c.execute("""SELECT pm.id, h.username, h.platform, pm.overlap_score, pm.overlap_reasons
FROM priority_matches pm
JOIN humans h ON pm.matched_human_id = h.id
WHERE pm.priority_user_id = ? AND pm.status IN ('new', 'pending')
ORDER BY pm.overlap_score DESC LIMIT ?""", (host['id'], limit))
matches = []
for row in c.fetchall():
reasons = j.loads(row[4]) if row[4] else []
matches.append({'id': row[0], 'about_user': row[1], 'about_platform': row[2],
'score': int(row[3]), 'reasons': reasons[:3]})
conn.close()
db.close()
self._send_json({'matches': matches})
except Exception as e:
self._send_json({'error': str(e)}, 500)
def _handle_pending_matches(self):
"""pending matches - returns BOTH directions for each match"""
import sqlite3
import json as j
limit = 30
if '?' in self.path:
for p in self.path.split('?')[1].split('&'):
if p.startswith('limit='):
try: limit = int(p.split('=')[1])
except ValueError: pass
try:
conn = sqlite3.connect('/data/db/connectd.db')
c = conn.cursor()
c.execute("""SELECT m.id, h1.username, h1.platform, h1.contact,
h2.username, h2.platform, h2.contact, m.overlap_score, m.overlap_reasons
FROM matches m
JOIN humans h1 ON m.human_a_id = h1.id
JOIN humans h2 ON m.human_b_id = h2.id
WHERE m.status = 'pending' AND m.overlap_score >= 60
ORDER BY m.overlap_score DESC LIMIT ?""", (limit // 2,))
matches = []
for row in c.fetchall():
contact_a = j.loads(row[3]) if row[3] else {}
contact_b = j.loads(row[6]) if row[6] else {}
reasons = j.loads(row[8]) if row[8] else []
# direction 1: TO human_a ABOUT human_b
method_a = 'email' if contact_a.get('email') else ('mastodon' if contact_a.get('mastodon') else None)
matches.append({'id': row[0], 'to_user': row[1], 'about_user': row[4],
'score': int(row[7]), 'reasons': reasons[:3], 'method': method_a,
'contact': contact_a.get('email') or contact_a.get('mastodon') or ''})
# direction 2: TO human_b ABOUT human_a
method_b = 'email' if contact_b.get('email') else ('mastodon' if contact_b.get('mastodon') else None)
matches.append({'id': row[0], 'to_user': row[4], 'about_user': row[1],
'score': int(row[7]), 'reasons': reasons[:3], 'method': method_b,
'contact': contact_b.get('email') or contact_b.get('mastodon') or ''})
conn.close()
self._send_json({'matches': matches})
except Exception as e:
self._send_json({'error': str(e)}, 500)
def _handle_stats(self):
"""return database statistics"""
try:
db = Database()
stats = db.stats()
db.close()
self._send_json(stats)
except Exception as e:
self._send_json({'error': str(e)}, 500)
def _handle_health(self):
"""return daemon health status"""
state = get_daemon_state()
health = {
'status': 'running' if state['running'] else 'stopped',
'dry_run': state['dry_run'],
'uptime_seconds': None,
}
if state['started_at']:
uptime = datetime.now() - datetime.fromisoformat(state['started_at'])
health['uptime_seconds'] = int(uptime.total_seconds())
self._send_json(health)
def _handle_state(self):
"""return full daemon state"""
state = get_daemon_state()
# convert datetimes to strings
for key in ['last_scout', 'last_match', 'last_intro', 'last_lost', 'started_at']:
if state[key] and isinstance(state[key], datetime):
state[key] = state[key].isoformat()
self._send_json(state)
def _handle_priority_matches(self):
"""return priority matches for HA sensor"""
try:
db = Database()
users = get_priority_users(db.conn)
if not users:
self._send_json({
'count': 0,
'new_count': 0,
'top_matches': [],
})
db.close()
return
# get matches for first priority user (host)
user = users[0]
matches = get_priority_user_matches(db.conn, user['id'], limit=10)
new_count = sum(1 for m in matches if m.get('status') == 'new')
top_matches = []
for m in matches[:5]:
overlap_reasons = m.get('overlap_reasons', '[]')
if isinstance(overlap_reasons, str):
import json as json_mod
overlap_reasons = json_mod.loads(overlap_reasons) if overlap_reasons else []
top_matches.append({
'username': m.get('username'),
'platform': m.get('platform'),
'score': m.get('score', 0),
'overlap_score': m.get('overlap_score', 0),
'reasons': overlap_reasons[:3],
'url': m.get('url'),
'status': m.get('status', 'new'),
})
db.close()
self._send_json({
'count': len(matches),
'new_count': new_count,
'top_matches': top_matches,
})
except Exception as e:
self._send_json({'error': str(e)}, 500)
def _handle_top_humans(self):
"""return top scoring humans for HA sensor"""
try:
db = Database()
humans = db.get_all_humans(min_score=50, limit=5)
top_humans = []
for h in humans:
contact = h.get('contact', '{}')
if isinstance(contact, str):
import json as json_mod
contact = json_mod.loads(contact) if contact else {}
signals = h.get('signals', '[]')
if isinstance(signals, str):
import json as json_mod
signals = json_mod.loads(signals) if signals else []
top_humans.append({
'username': h.get('username'),
'platform': h.get('platform'),
'score': h.get('score', 0),
'name': h.get('name'),
'signals': signals[:5],
'contact_method': 'email' if contact.get('email') else
'mastodon' if contact.get('mastodon') else
'matrix' if contact.get('matrix') else 'manual',
})
db.close()
self._send_json({
'count': len(humans),
'top_humans': top_humans,
})
except Exception as e:
self._send_json({'error': str(e)}, 500)
def _handle_user(self):
"""return priority user info for HA sensor"""
try:
db = Database()
users = get_priority_users(db.conn)
if not users:
self._send_json({
'configured': False,
'score': 0,
'signals': [],
'match_count': 0,
})
db.close()
return
user = users[0]
signals = user.get('signals', '[]')
if isinstance(signals, str):
import json as json_mod
signals = json_mod.loads(signals) if signals else []
interests = user.get('interests', '[]')
if isinstance(interests, str):
import json as json_mod
interests = json_mod.loads(interests) if interests else []
matches = get_priority_user_matches(db.conn, user['id'], limit=100)
db.close()
self._send_json({
'configured': True,
'name': user.get('name'),
'github': user.get('github'),
'mastodon': user.get('mastodon'),
'reddit': user.get('reddit'),
'lobsters': user.get('lobsters'),
'matrix': user.get('matrix'),
'lemmy': user.get('lemmy'),
'discord': user.get('discord'),
'bluesky': user.get('bluesky'),
'score': user.get('score', 0),
'signals': signals[:10],
'interests': interests,
'location': user.get('location'),
'bio': user.get('bio'),
'match_count': len(matches),
'new_match_count': sum(1 for m in matches if m.get('status') == 'new'),
})
except Exception as e:
self._send_json({'error': str(e)}, 500)
def run_api_server():
"""run the API server in a thread"""
server = HTTPServer(('0.0.0.0', API_PORT), APIHandler)
print(f"connectd api running on port {API_PORT}")
server.serve_forever()
def start_api_thread():
"""start API server in background thread"""
thread = threading.Thread(target=run_api_server, daemon=True)
thread.start()
return thread
if __name__ == '__main__':
# standalone mode for testing
print(f"starting connectd api on port {API_PORT}...")
run_api_server()
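
The handlers above repeat the same manual `?limit=`/`?id=` string splitting. A minimal standalone sketch of that pattern using `urllib.parse` instead; `parse_int_param` is a hypothetical helper for illustration, not part of connectd:

```python
from urllib.parse import urlparse, parse_qs

def parse_int_param(path, name, default=None):
    """return the named query parameter as an int, or default when
    the parameter is missing or not a valid integer (mirrors the
    inline limit=/id= parsing in the handlers above)"""
    qs = parse_qs(urlparse(path).query)
    try:
        return int(qs[name][0])
    except (KeyError, ValueError):
        return default

print(parse_int_param('/api/preview_intros?limit=10', 'limit', 5))  # 10
print(parse_int_param('/api/preview_intros', 'limit', 5))           # 5
```

`parse_qs` also handles URL-decoding and repeated parameters, which the split-on-`&` loops above do not.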

File diff suppressed because it is too large

api_orig.py

@@ -0,0 +1,608 @@
#!/usr/bin/env python3
"""
connectd/api.py - REST API for stats and control
exposes daemon stats for home assistant integration.
runs on port 8099 by default.
"""
import os
import json
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler
from datetime import datetime
from db import Database
from db.users import get_priority_users, get_priority_user_matches, get_priority_user
API_PORT = int(os.environ.get('CONNECTD_API_PORT', 8099))
# shared state (updated by daemon)
_daemon_state = {
'running': False,
'dry_run': False,
'last_scout': None,
'last_match': None,
'last_intro': None,
'last_lost': None,
'intros_today': 0,
'lost_intros_today': 0,
'started_at': None,
}
def update_daemon_state(state_dict):
"""update shared daemon state (called by daemon)"""
global _daemon_state
_daemon_state.update(state_dict)
def get_daemon_state():
"""get current daemon state"""
return _daemon_state.copy()
class APIHandler(BaseHTTPRequestHandler):
"""simple REST API handler"""
def log_message(self, format, *args):
"""suppress default logging"""
pass
def _send_json(self, data, status=200):
"""send JSON response"""
self.send_response(status)
self.send_header('Content-Type', 'application/json')
self.send_header('Access-Control-Allow-Origin', '*')
self.end_headers()
self.wfile.write(json.dumps(data).encode())
def do_GET(self):
"""handle GET requests"""
path = self.path.split('?')[0] # strip query params for routing
if path == '/api/stats':
self._handle_stats()
elif path == '/api/health':
self._handle_health()
elif path == '/api/state':
self._handle_state()
elif path == '/api/priority_matches':
self._handle_priority_matches()
elif path == '/api/top_humans':
self._handle_top_humans()
elif path == '/api/user':
self._handle_user()
elif path == '/dashboard' or path == '/':
self._handle_dashboard()
elif path == '/api/preview_intros':
self._handle_preview_intros()
elif path == '/api/sent_intros':
self._handle_sent_intros()
elif path == '/api/failed_intros':
self._handle_failed_intros()
else:
self._send_json({'error': 'not found'}, 404)
def _handle_stats(self):
"""return database statistics"""
try:
db = Database()
stats = db.stats()
db.close()
self._send_json(stats)
except Exception as e:
self._send_json({'error': str(e)}, 500)
def _handle_health(self):
"""return daemon health status"""
state = get_daemon_state()
health = {
'status': 'running' if state['running'] else 'stopped',
'dry_run': state['dry_run'],
'uptime_seconds': None,
}
if state['started_at']:
uptime = datetime.now() - datetime.fromisoformat(state['started_at'])
health['uptime_seconds'] = int(uptime.total_seconds())
self._send_json(health)
def _handle_state(self):
"""return full daemon state"""
state = get_daemon_state()
# convert datetimes to strings
for key in ['last_scout', 'last_match', 'last_intro', 'last_lost', 'started_at']:
if state[key] and isinstance(state[key], datetime):
state[key] = state[key].isoformat()
self._send_json(state)
def _handle_priority_matches(self):
"""return priority matches for HA sensor"""
try:
db = Database()
users = get_priority_users(db.conn)
if not users:
self._send_json({
'count': 0,
'new_count': 0,
'top_matches': [],
})
db.close()
return
# get matches for first priority user (host)
user = users[0]
matches = get_priority_user_matches(db.conn, user['id'], limit=10)
new_count = sum(1 for m in matches if m.get('status') == 'new')
top_matches = []
for m in matches[:5]:
overlap_reasons = m.get('overlap_reasons', '[]')
if isinstance(overlap_reasons, str):
import json as json_mod
overlap_reasons = json_mod.loads(overlap_reasons) if overlap_reasons else []
top_matches.append({
'username': m.get('username'),
'platform': m.get('platform'),
'score': m.get('score', 0),
'overlap_score': m.get('overlap_score', 0),
'reasons': overlap_reasons[:3],
'url': m.get('url'),
'status': m.get('status', 'new'),
})
db.close()
self._send_json({
'count': len(matches),
'new_count': new_count,
'top_matches': top_matches,
})
except Exception as e:
self._send_json({'error': str(e)}, 500)
def _handle_top_humans(self):
"""return top scoring humans for HA sensor"""
try:
db = Database()
humans = db.get_all_humans(min_score=50, limit=5)
top_humans = []
for h in humans:
contact = h.get('contact', '{}')
if isinstance(contact, str):
import json as json_mod
contact = json_mod.loads(contact) if contact else {}
signals = h.get('signals', '[]')
if isinstance(signals, str):
import json as json_mod
signals = json_mod.loads(signals) if signals else []
top_humans.append({
'username': h.get('username'),
'platform': h.get('platform'),
'score': h.get('score', 0),
'name': h.get('name'),
'signals': signals[:5],
'contact_method': 'email' if contact.get('email') else
'mastodon' if contact.get('mastodon') else
'matrix' if contact.get('matrix') else 'manual',
})
db.close()
self._send_json({
'count': len(humans),
'top_humans': top_humans,
})
except Exception as e:
self._send_json({'error': str(e)}, 500)
def _handle_user(self):
"""return priority user info for HA sensor"""
try:
db = Database()
users = get_priority_users(db.conn)
if not users:
self._send_json({
'configured': False,
'score': 0,
'signals': [],
'match_count': 0,
})
db.close()
return
user = users[0]
signals = user.get('signals', '[]')
if isinstance(signals, str):
import json as json_mod
signals = json_mod.loads(signals) if signals else []
interests = user.get('interests', '[]')
if isinstance(interests, str):
import json as json_mod
interests = json_mod.loads(interests) if interests else []
matches = get_priority_user_matches(db.conn, user['id'], limit=100)
db.close()
self._send_json({
'configured': True,
'name': user.get('name'),
'github': user.get('github'),
'mastodon': user.get('mastodon'),
'reddit': user.get('reddit'),
'lobsters': user.get('lobsters'),
'matrix': user.get('matrix'),
'lemmy': user.get('lemmy'),
'discord': user.get('discord'),
'bluesky': user.get('bluesky'),
'score': user.get('score', 0),
'signals': signals[:10],
'interests': interests,
'location': user.get('location'),
'bio': user.get('bio'),
'match_count': len(matches),
'new_match_count': sum(1 for m in matches if m.get('status') == 'new'),
})
except Exception as e:
self._send_json({'error': str(e)}, 500)
def run_api_server():
"""run the API server in a thread"""
server = HTTPServer(('0.0.0.0', API_PORT), APIHandler)
print(f"connectd api running on port {API_PORT}")
server.serve_forever()
def start_api_thread():
"""start API server in background thread"""
thread = threading.Thread(target=run_api_server, daemon=True)
thread.start()
return thread
if __name__ == '__main__':
# standalone mode for testing
print(f"starting connectd api on port {API_PORT}...")
run_api_server()
# === DASHBOARD ENDPOINTS ===
DASHBOARD_HTML = """<!DOCTYPE html>
<html>
<head>
<title>connectd dashboard</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<style>
* { box-sizing: border-box; margin: 0; padding: 0; }
body {
font-family: monospace;
background: #0a0a0f;
color: #00ffc8;
padding: 20px;
line-height: 1.6;
}
h1 { color: #c792ea; margin-bottom: 20px; }
h2 { color: #82aaff; margin: 20px 0 10px; border-bottom: 1px solid #333; padding-bottom: 5px; }
.stats { display: flex; gap: 20px; flex-wrap: wrap; margin-bottom: 20px; }
.stat {
background: #1a1a2e;
padding: 15px 25px;
border-radius: 8px;
border: 1px solid #333;
}
.stat-value { font-size: 2em; color: #c792ea; }
.stat-label { color: #888; font-size: 0.9em; }
.intro-card {
background: #1a1a2e;
border: 1px solid #333;
border-radius: 8px;
padding: 15px;
margin-bottom: 15px;
}
.intro-header {
display: flex;
justify-content: space-between;
margin-bottom: 10px;
color: #82aaff;
}
.intro-score {
background: #2a2a4e;
padding: 2px 8px;
border-radius: 4px;
color: #c792ea;
}
.intro-body {
background: #0d0d15;
padding: 15px;
border-radius: 4px;
white-space: pre-wrap;
font-size: 0.95em;
color: #ddd;
}
.intro-meta { color: #666; font-size: 0.85em; margin-top: 10px; }
.method {
display: inline-block;
padding: 2px 6px;
border-radius: 3px;
font-size: 0.85em;
}
.method-email { background: #2d4a2d; color: #8f8; }
.method-mastodon { background: #3d3a5c; color: #c792ea; }
.method-github { background: #2d3a4a; color: #82aaff; }
.method-manual { background: #4a3a2d; color: #ffa; }
.tab-buttons { margin-bottom: 20px; }
.tab-btn {
background: #1a1a2e;
border: 1px solid #333;
color: #00ffc8;
padding: 10px 20px;
cursor: pointer;
font-family: monospace;
}
.tab-btn.active { background: #2a2a4e; border-color: #00ffc8; }
.tab-content { display: none; }
.tab-content.active { display: block; }
.refresh-btn {
background: #00ffc8;
color: #0a0a0f;
border: none;
padding: 10px 20px;
cursor: pointer;
font-family: monospace;
font-weight: bold;
margin-left: 20px;
}
.error { color: #ff6b6b; }
.success { color: #69ff69; }
</style>
</head>
<body>
<h1>connectd <span style="color:#666;font-size:0.6em">dashboard</span></h1>
<div class="stats" id="stats"></div>
<div class="tab-buttons">
<button class="tab-btn active" onclick="showTab('pending')">pending previews</button>
<button class="tab-btn" onclick="showTab('sent')">sent intros</button>
<button class="tab-btn" onclick="showTab('failed')">failed</button>
<button class="refresh-btn" onclick="loadAll()">refresh</button>
</div>
<div id="pending" class="tab-content active"></div>
<div id="sent" class="tab-content"></div>
<div id="failed" class="tab-content"></div>
<script>
async function loadStats() {
const res = await fetch('/api/stats');
const data = await res.json();
document.getElementById('stats').innerHTML = `
<div class="stat"><div class="stat-value">${data.total_humans}</div><div class="stat-label">humans tracked</div></div>
<div class="stat"><div class="stat-value">${data.total_matches}</div><div class="stat-label">total matches</div></div>
<div class="stat"><div class="stat-value">${data.sent_intros}</div><div class="stat-label">intros sent</div></div>
<div class="stat"><div class="stat-value">${data.high_score_humans}</div><div class="stat-label">high score</div></div>
`;
}
async function loadPending() {
const res = await fetch('/api/preview_intros?limit=10');
const data = await res.json();
let html = '<h2>pending intro previews</h2>';
if (data.previews) {
for (const p of data.previews) {
html += `<div class="intro-card">
<div class="intro-header">
<span>${p.from_platform}:${p.from_user} -> ${p.to_platform}:${p.to_user}</span>
<span class="intro-score">score: ${p.score}</span>
</div>
<div class="intro-body">${p.draft || '[generating...]' }</div>
<div class="intro-meta">
method: <span class="method method-${p.method}">${p.method}</span>
| contact: ${p.contact_info || 'n/a'}
| reasons: ${(p.reasons || []).slice(0,2).join(', ') || 'aligned values'}
</div>
</div>`;
}
}
document.getElementById('pending').innerHTML = html;
}
async function loadSent() {
const res = await fetch('/api/sent_intros?limit=20');
const data = await res.json();
let html = '<h2>sent intros</h2>';
if (data.sent) {
for (const s of data.sent) {
html += `<div class="intro-card">
<div class="intro-header">
<span>${s.recipient_id}</span>
<span class="method method-${s.method}">${s.method}</span>
</div>
<div class="intro-meta">
sent: ${s.timestamp} | score: ${s.overlap_score?.toFixed(0) || '?'}
</div>
</div>`;
}
}
document.getElementById('sent').innerHTML = html;
}
async function loadFailed() {
const res = await fetch('/api/failed_intros');
const data = await res.json();
let html = '<h2>failed deliveries</h2>';
if (data.failed) {
for (const f of data.failed) {
html += `<div class="intro-card">
<div class="intro-header">
<span>${f.recipient_id}</span>
<span class="method method-${f.method}">${f.method}</span>
</div>
<div class="intro-meta error">error: ${f.error}</div>
</div>`;
}
}
document.getElementById('failed').innerHTML = html;
}
function showTab(name) {
document.querySelectorAll('.tab-content').forEach(el => el.classList.remove('active'));
document.querySelectorAll('.tab-btn').forEach(el => el.classList.remove('active'));
document.getElementById(name).classList.add('active');
event.target.classList.add('active');
}
function loadAll() {
loadStats();
loadPending();
loadSent();
loadFailed();
}
loadAll();
setInterval(loadAll, 30000);
</script>
</body>
</html>
"""
class DashboardMixin:
"""mixin to add dashboard endpoints to APIHandler"""
def _handle_dashboard(self):
"""serve the dashboard HTML"""
self.send_response(200)
self.send_header('Content-Type', 'text/html')
self.end_headers()
self.wfile.write(DASHBOARD_HTML.encode())
def _handle_preview_intros(self):
"""preview pending intros with draft generation"""
import sqlite3
import json
from introd.groq_draft import draft_intro_with_llm, determine_contact_method
# parse limit from query string
limit = 5
if '?' in self.path:
query = self.path.split('?')[1]
for param in query.split('&'):
if param.startswith('limit='):
try:
limit = int(param.split('=')[1])
except ValueError:
pass
conn = sqlite3.connect('/data/db/connectd.db')
c = conn.cursor()
c.execute("""SELECT h1.username, h1.platform, h1.contact, h1.extra,
h2.username, h2.platform, h2.contact, h2.extra,
m.overlap_score, m.overlap_reasons
FROM matches m
JOIN humans h1 ON m.human_a_id = h1.id
JOIN humans h2 ON m.human_b_id = h2.id
WHERE m.status = 'pending' AND m.overlap_score >= 60
ORDER BY m.overlap_score DESC
LIMIT ?""", (limit,))
previews = []
for row in c.fetchall():
human_a = {
'username': row[0], 'platform': row[1],
'contact': json.loads(row[2]) if row[2] else {},
'extra': json.loads(row[3]) if row[3] else {}
}
human_b = {
'username': row[4], 'platform': row[5],
'contact': json.loads(row[6]) if row[6] else {},
'extra': json.loads(row[7]) if row[7] else {}
}
reasons = json.loads(row[9]) if row[9] else []
match_data = {
'human_a': human_a, 'human_b': human_b,
'overlap_score': row[8], 'overlap_reasons': reasons
}
# determine contact method
method, contact_info = determine_contact_method(human_a)
# generate draft (skip on any error so the preview list still renders)
draft = None
try:
result, _ = draft_intro_with_llm(match_data, recipient='a', dry_run=True)
if result:
draft = result.get('draft')
except Exception:
pass
previews.append({
'from_platform': human_b['platform'],
'from_user': human_b['username'],
'to_platform': human_a['platform'],
'to_user': human_a['username'],
'score': int(row[8]),
'reasons': reasons[:3],
'method': method,
'contact_info': str(contact_info) if contact_info else None,
'draft': draft
})
conn.close()
self._send_json({'previews': previews})
def _handle_sent_intros(self):
"""return sent intro history from delivery log"""
import json
from pathlib import Path
limit = 20
if '?' in self.path:
query = self.path.split('?')[1]
for param in query.split('&'):
if param.startswith('limit='):
try:
limit = int(param.split('=')[1])
except ValueError:
pass
log_path = Path('/app/data/delivery_log.json')
if log_path.exists():
with open(log_path) as f:
log = json.load(f)
sent = log.get('sent', [])[-limit:]
sent.reverse() # newest first
else:
sent = []
self._send_json({'sent': sent})
def _handle_failed_intros(self):
"""return failed delivery attempts"""
import json
from pathlib import Path
log_path = Path('/app/data/delivery_log.json')
if log_path.exists():
with open(log_path) as f:
log = json.load(f)
failed = log.get('failed', [])
else:
failed = []
self._send_json({'failed': failed})
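
`_handle_sent_intros` and `_handle_failed_intros` both open the delivery log with no guard against a corrupt file. A sketch of that read pattern with `JSONDecodeError` handling added; `load_delivery_log` is a hypothetical helper, not part of connectd:

```python
import json
from pathlib import Path

def load_delivery_log(path='/app/data/delivery_log.json'):
    """read the delivery log, returning an empty log when the file
    is missing or contains invalid JSON (sketch of the pattern in
    _handle_sent_intros/_handle_failed_intros, with error handling added)"""
    empty = {'sent': [], 'failed': [], 'queued': []}
    p = Path(path)
    if not p.exists():
        return empty
    try:
        return json.loads(p.read_text())
    except json.JSONDecodeError:
        return empty

log = load_delivery_log('/nonexistent/delivery_log.json')
print(len(log['sent']))  # 0
```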


@@ -0,0 +1,137 @@
{
"sent": [
{
"recipient_id": "github:dwmw2",
"recipient_name": "David Woodhouse",
"method": "email",
"contact_info": "dwmw2@infradead.org",
"overlap_score": 172.01631023799695,
"timestamp": "2025-12-15T23:14:45.542509",
"success": true,
"error": null
},
{
"recipient_id": "github:pvizeli",
"recipient_name": "Pascal Vizeli",
"method": "email",
"contact_info": "pascal.vizeli@syshack.ch",
"overlap_score": 163.33333333333331,
"timestamp": "2025-12-15T23:14:48.462716",
"success": true,
"error": null
},
{
"recipient_id": "github:2234839",
"recipient_name": "\u5d2e\u751f",
"method": "email",
"contact_info": "admin@shenzilong.cn",
"overlap_score": 163.09442000261095,
"timestamp": "2025-12-15T23:14:50.749442",
"success": true,
"error": null
},
{
"recipient_id": "github:zomars",
"recipient_name": "Omar L\u00f3pez",
"method": "email",
"contact_info": "zomars@me.com",
"overlap_score": 138.9593178751708,
"timestamp": "2025-12-16T00:39:43.266181",
"success": true,
"error": null
},
{
"recipient_id": "github:joshuaboniface",
"recipient_name": "Joshua M. Boniface",
"method": "mastodon",
"contact_info": "@joshuaboniface@www.youtube.com",
"overlap_score": 136.06304901929022,
"timestamp": "2025-12-16T00:59:21.763092",
"success": true,
"error": "https://mastodon.sudoxreboot.com/@connectd/115726533401043321"
},
{
"recipient_id": "github:dariusk",
"recipient_name": "Darius Kazemi",
"method": "mastodon",
"contact_info": "@darius@friend.camp",
"overlap_score": 135.39490109778416,
"timestamp": "2025-12-16T00:59:22.199945",
"success": true,
"error": "https://mastodon.sudoxreboot.com/@connectd/115726533505124538"
}
],
"failed": [
{
"recipient_id": "github:joyeusenoelle",
"recipient_name": "No\u00eblle Anthony",
"method": "mastodon",
"contact_info": "@noelle@chat.noelle.codes",
"overlap_score": 65,
"timestamp": "2025-12-14T23:44:17.215796",
"success": false,
"error": "MASTODON_TOKEN not set"
},
{
"recipient_id": "github:balloob",
"recipient_name": "Paulus Schoutsen",
"method": "mastodon",
"contact_info": "@home_assistant@youtube.com",
"overlap_score": 163.09442000261095,
"timestamp": "2025-12-15T23:14:50.155178",
"success": false,
"error": "mastodon api error: 401 - {\"error\":\"The access token is invalid\"}"
},
{
"recipient_id": "github:balloob",
"recipient_name": "Paulus Schoutsen",
"method": "mastodon",
"contact_info": "@home_assistant@youtube.com",
"overlap_score": 163.09442000261095,
"timestamp": "2025-12-15T23:14:50.334902",
"success": false,
"error": "mastodon api error: 401 - {\"error\":\"The access token is invalid\"}"
},
{
"recipient_id": "github:joshuaboniface",
"recipient_name": "Joshua M. Boniface",
"method": "mastodon",
"contact_info": "@joshuaboniface@www.youtube.com",
"overlap_score": 136.06304901929022,
"timestamp": "2025-12-16T00:53:25.848601",
"success": false,
"error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794e05e490>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
},
{
"recipient_id": "github:joshuaboniface",
"recipient_name": "Joshua M. Boniface",
"method": "mastodon",
"contact_info": "@joshuaboniface@www.youtube.com",
"overlap_score": 136.06304901929022,
"timestamp": "2025-12-16T00:53:55.912872",
"success": false,
"error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794e07b1d0>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
},
{
"recipient_id": "github:dariusk",
"recipient_name": "Darius Kazemi",
"method": "mastodon",
"contact_info": "@darius@friend.camp",
"overlap_score": 135.39490109778416,
"timestamp": "2025-12-16T00:54:25.947404",
"success": false,
"error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794e0986d0>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
},
{
"recipient_id": "github:dariusk",
"recipient_name": "Darius Kazemi",
"method": "mastodon",
"contact_info": "@darius@friend.camp",
"overlap_score": 135.39490109778416,
"timestamp": "2025-12-16T00:54:55.982839",
"success": false,
"error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794de9dd90>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
}
],
"queued": []
}
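
The `failed` entries above fall into a few distinct classes (missing token, invalid token, connection timeout). A sketch that tallies failures by coarse error class, using the same schema as the log above with sample entries inlined rather than read from the real file; `error_class` is a hypothetical helper for illustration:

```python
import json
from collections import Counter

# sample entries mirroring the delivery log schema above (not the real log)
sample = '''{"failed": [
  {"method": "mastodon", "error": "MASTODON_TOKEN not set"},
  {"method": "mastodon", "error": "mastodon api error: 401 - invalid token"},
  {"method": "mastodon", "error": "HTTPSConnectionPool: Max retries exceeded (connect timeout=30)"}
]}'''

def error_class(msg):
    """map a raw delivery error string onto a coarse failure class"""
    if 'not set' in msg:
        return 'config'
    if 'api error: 401' in msg:
        return 'auth'
    if 'timeout' in msg or 'Max retries' in msg:
        return 'network'
    return 'other'

log = json.loads(sample)
counts = Counter(error_class(f['error']) for f in log['failed'])
print(dict(counts))  # {'config': 1, 'auth': 1, 'network': 1}
```

Grouping this way distinguishes fixable configuration problems from transient network errors before retrying.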


@@ -0,0 +1,371 @@
[
{
"platform": "reddit",
"username": "julietcam84",
"url": "https://reddit.com/u/julietcam84",
"score": 195,
"subreddits": [
"cooperatives",
"intentionalcommunity"
],
"signals": [
"cooperative",
"community",
"intentional_community",
"remote"
],
"reasons": [
"active in: cooperatives, intentionalcommunity",
"signals: cooperative, community, intentional_community, remote",
"REDDIT-ONLY: needs manual review for outreach"
],
"note": "reddit-only user - no external links found. DM manually if promising.",
"queued_at": "2025-12-15T09:06:32.705954",
"status": "pending"
},
{
"platform": "reddit",
"username": "MasterRoshi1620",
"url": "https://reddit.com/u/MasterRoshi1620",
"score": 159,
"subreddits": [
"selfhosted",
"homelab"
],
"signals": [
"unix",
"privacy",
"selfhosted",
"modern_lang",
"containers"
],
"reasons": [
"active in: selfhosted, homelab",
"signals: unix, privacy, selfhosted, modern_lang, containers",
"REDDIT-ONLY: needs manual review for outreach"
],
"note": "reddit-only user - no external links found. DM manually if promising.",
"queued_at": "2025-12-15T22:54:56.414100",
"status": "pending"
},
{
"match": {
"id": 2779,
"human_a": {
"id": 642,
"username": "qcasey",
"platform": "github",
"name": "Quinn Casey",
"url": "https://github.com/qcasey",
"contact": "{\"email\": \"github@letterq.org\", \"emails\": [\"github@letterq.org\", \"134208@letterq.org\", \"ceo@business.net\", \"career@letterq.org\", \"recruitmentspam@letterq.org\"], \"blog\": \"https://quinncasey.com\", \"twitter\": null, \"mastodon\": \"@678995876047487016@discord.com\", \"bluesky\": \"quinncasey.com\", \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"JavaScript\": 12, \"Python\": 21, \"Go\": 15, \"TypeScript\": 4, \"Svelte\": 1, \"Rust\": 1, \"Kotlin\": 2, \"HTML\": 1, \"CSS\": 2, \"C\": 1, \"Dart\": 2, \"Ruby\": 1, \"C++\": 2, \"Dockerfile\": 1, \"Java\": 1, \"Shell\": 1, \"PHP\": 1, \"AppleScript\": 1}, \"repo_count\": 100, \"total_stars\": 324, \"extra\": {\"topics\": [], \"languages\": {\"JavaScript\": 12, \"Python\": 21, \"Go\": 15, \"TypeScript\": 4, \"Svelte\": 1, \"Rust\": 1, \"Kotlin\": 2, \"HTML\": 1, \"CSS\": 2, \"C\": 1, \"Dart\": 2, \"Ruby\": 1, \"C++\": 2, \"Dockerfile\": 1, \"Java\": 1, \"Shell\": 1, \"PHP\": 1, \"AppleScript\": 1}, \"repo_count\": 100, \"total_stars\": 324, \"hireable\": true, \"handles\": {\"github\": \"qcasey\", \"telegram\": \"@qcasey\", \"bluesky\": \"quinncasey.com\", \"mastodon\": \"@678995876047487016@discord.com\"}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:43:55.251547\"}"
},
"human_b": {
"id": 91,
"username": "mib1185",
"platform": "github",
"name": "Michael",
"url": "https://github.com/mib1185",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"Python\": 27, \"HTML\": 2, \"TypeScript\": 1, \"Dockerfile\": 2, \"Shell\": 8, \"JavaScript\": 1, \"Jinja\": 2, \"PHP\": 1, \"Go\": 1}, \"repo_count\": 85, \"total_stars\": 136, \"extra\": {\"topics\": [], \"languages\": {\"Python\": 27, \"HTML\": 2, \"TypeScript\": 1, \"Dockerfile\": 2, \"Shell\": 8, \"JavaScript\": 1, \"Jinja\": 2, \"PHP\": 1, \"Go\": 1}, \"repo_count\": 85, \"total_stars\": 136, \"hireable\": null, \"handles\": {\"github\": \"ansible\"}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:08:57.297790\"}"
},
"overlap_score": 185.0,
"overlap_reasons": "[\"shared values: unix, foss, federated_chat, home_automation, privacy\", \"both remote-friendly\", \"complementary skills: Kotlin, C++, Jinja, Ruby, CSS\"]"
},
"draft": "hi Quinn,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: using JavaScript, Python, Go | (100 repos) | interested in foss, home_automation, privacy\n\nMichael is building: using Python, HTML, TypeScript | (85 repos) | interested in foss, home_automation, privacy\n\noverlap: shared values: unix, foss, federated_chat, home_automation, privacy | both remote-friendly | complementary skills: Kotlin, C++, Jinja, Ruby, CSS\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/mib1185\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
"recipient": {
"id": 91,
"username": "mib1185",
"platform": "github",
"name": "Michael",
"url": "https://github.com/mib1185",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"Python\": 27, \"HTML\": 2, \"TypeScript\": 1, \"Dockerfile\": 2, \"Shell\": 8, \"JavaScript\": 1, \"Jinja\": 2, \"PHP\": 1, \"Go\": 1}, \"repo_count\": 85, \"total_stars\": 136, \"extra\": {\"topics\": [], \"languages\": {\"Python\": 27, \"HTML\": 2, \"TypeScript\": 1, \"Dockerfile\": 2, \"Shell\": 8, \"JavaScript\": 1, \"Jinja\": 2, \"PHP\": 1, \"Go\": 1}, \"repo_count\": 85, \"total_stars\": 136, \"hireable\": null, \"handles\": {\"github\": \"ansible\"}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:08:57.297790\"}"
},
"queued_at": "2025-12-15T23:14:45.528184",
"status": "pending"
},
{
"match": {
"id": 2795,
"human_a": {
"id": 642,
"username": "qcasey",
"platform": "github",
"name": "Quinn Casey",
"url": "https://github.com/qcasey",
"contact": "{\"email\": \"github@letterq.org\", \"emails\": [\"github@letterq.org\", \"134208@letterq.org\", \"ceo@business.net\", \"career@letterq.org\", \"recruitmentspam@letterq.org\"], \"blog\": \"https://quinncasey.com\", \"twitter\": null, \"mastodon\": \"@678995876047487016@discord.com\", \"bluesky\": \"quinncasey.com\", \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"JavaScript\": 12, \"Python\": 21, \"Go\": 15, \"TypeScript\": 4, \"Svelte\": 1, \"Rust\": 1, \"Kotlin\": 2, \"HTML\": 1, \"CSS\": 2, \"C\": 1, \"Dart\": 2, \"Ruby\": 1, \"C++\": 2, \"Dockerfile\": 1, \"Java\": 1, \"Shell\": 1, \"PHP\": 1, \"AppleScript\": 1}, \"repo_count\": 100, \"total_stars\": 324, \"extra\": {\"topics\": [], \"languages\": {\"JavaScript\": 12, \"Python\": 21, \"Go\": 15, \"TypeScript\": 4, \"Svelte\": 1, \"Rust\": 1, \"Kotlin\": 2, \"HTML\": 1, \"CSS\": 2, \"C\": 1, \"Dart\": 2, \"Ruby\": 1, \"C++\": 2, \"Dockerfile\": 1, \"Java\": 1, \"Shell\": 1, \"PHP\": 1, \"AppleScript\": 1}, \"repo_count\": 100, \"total_stars\": 324, \"hireable\": true, \"handles\": {\"github\": \"qcasey\", \"telegram\": \"@qcasey\", \"bluesky\": \"quinncasey.com\", \"mastodon\": \"@678995876047487016@discord.com\"}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:43:55.251547\"}"
},
"human_b": {
"id": 110,
"username": "RoboMagus",
"platform": "github",
"name": null,
"url": "https://github.com/RoboMagus",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"home_automation\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"Python\": 17, \"Vue\": 3, \"HTML\": 1, \"JavaScript\": 11, \"C++\": 7, \"TypeScript\": 6, \"Go\": 3, \"Kotlin\": 1, \"Shell\": 4, \"Dockerfile\": 2, \"C\": 1, \"Less\": 1}, \"repo_count\": 86, \"total_stars\": 77, \"extra\": {\"topics\": [], \"languages\": {\"Python\": 17, \"Vue\": 3, \"HTML\": 1, \"JavaScript\": 11, \"C++\": 7, \"TypeScript\": 6, \"Go\": 3, \"Kotlin\": 1, \"Shell\": 4, \"Dockerfile\": 2, \"C\": 1, \"Less\": 1}, \"repo_count\": 86, \"total_stars\": 77, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:09:50.629088\"}"
},
"overlap_score": 173.03582460328593,
"overlap_reasons": "[\"shared values: unix, foss, home_automation, privacy, community\", \"both remote-friendly\", \"complementary skills: Less, Ruby, CSS, Dart, PHP\"]"
},
"draft": "hi Quinn,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: using JavaScript, Python, Go | (100 repos) | interested in foss, home_automation, privacy\n\nRoboMagus is building: using Python, Vue, HTML | (86 repos) | interested in foss, home_automation, privacy\n\noverlap: shared values: unix, foss, home_automation, privacy, community | both remote-friendly | complementary skills: Less, Ruby, CSS, Dart, PHP\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/RoboMagus\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
"recipient": {
"id": 110,
"username": "RoboMagus",
"platform": "github",
"name": null,
"url": "https://github.com/RoboMagus",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"home_automation\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"Python\": 17, \"Vue\": 3, \"HTML\": 1, \"JavaScript\": 11, \"C++\": 7, \"TypeScript\": 6, \"Go\": 3, \"Kotlin\": 1, \"Shell\": 4, \"Dockerfile\": 2, \"C\": 1, \"Less\": 1}, \"repo_count\": 86, \"total_stars\": 77, \"extra\": {\"topics\": [], \"languages\": {\"Python\": 17, \"Vue\": 3, \"HTML\": 1, \"JavaScript\": 11, \"C++\": 7, \"TypeScript\": 6, \"Go\": 3, \"Kotlin\": 1, \"Shell\": 4, \"Dockerfile\": 2, \"C\": 1, \"Less\": 1}, \"repo_count\": 86, \"total_stars\": 77, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:09:50.629088\"}"
},
"queued_at": "2025-12-15T23:14:45.535258",
"status": "pending"
},
{
"match": {
"id": 2768,
"human_a": {
"id": 642,
"username": "qcasey",
"platform": "github",
"name": "Quinn Casey",
"url": "https://github.com/qcasey",
"contact": "{\"email\": \"github@letterq.org\", \"emails\": [\"github@letterq.org\", \"134208@letterq.org\", \"ceo@business.net\", \"career@letterq.org\", \"recruitmentspam@letterq.org\"], \"blog\": \"https://quinncasey.com\", \"twitter\": null, \"mastodon\": \"@678995876047487016@discord.com\", \"bluesky\": \"quinncasey.com\", \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"JavaScript\": 12, \"Python\": 21, \"Go\": 15, \"TypeScript\": 4, \"Svelte\": 1, \"Rust\": 1, \"Kotlin\": 2, \"HTML\": 1, \"CSS\": 2, \"C\": 1, \"Dart\": 2, \"Ruby\": 1, \"C++\": 2, \"Dockerfile\": 1, \"Java\": 1, \"Shell\": 1, \"PHP\": 1, \"AppleScript\": 1}, \"repo_count\": 100, \"total_stars\": 324, \"extra\": {\"topics\": [], \"languages\": {\"JavaScript\": 12, \"Python\": 21, \"Go\": 15, \"TypeScript\": 4, \"Svelte\": 1, \"Rust\": 1, \"Kotlin\": 2, \"HTML\": 1, \"CSS\": 2, \"C\": 1, \"Dart\": 2, \"Ruby\": 1, \"C++\": 2, \"Dockerfile\": 1, \"Java\": 1, \"Shell\": 1, \"PHP\": 1, \"AppleScript\": 1}, \"repo_count\": 100, \"total_stars\": 324, \"hireable\": true, \"handles\": {\"github\": \"qcasey\", \"telegram\": \"@qcasey\", \"bluesky\": \"quinncasey.com\", \"mastodon\": \"@678995876047487016@discord.com\"}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:43:55.251547\"}"
},
"human_b": {
"id": 415,
"username": "sbilly",
"platform": "github",
"name": "sbilly",
"url": "https://github.com/sbilly",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"http://sbilly.com/\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"decentralized\", \"community\", \"modern_lang\", \"containers\", \"remote\"]",
"extra": "{\"topics\": [\"mesh-network\"], \"languages\": {\"Go\": 4, \"Shell\": 4, \"Dockerfile\": 1, \"Python\": 12, \"JavaScript\": 14, \"Java\": 3, \"Ruby\": 3, \"CSS\": 3, \"C++\": 6, \"CoffeeScript\": 1, \"Scala\": 2, \"HTML\": 5, \"Vue\": 1, \"Clojure\": 1, \"PHP\": 3, \"TypeScript\": 1, \"C\": 8, \"Assembly\": 2, \"Objective-C\": 1, \"C#\": 1}, \"repo_count\": 100, \"total_stars\": 14354, \"extra\": {\"topics\": [\"mesh-network\"], \"languages\": {\"Go\": 4, \"Shell\": 4, \"Dockerfile\": 1, \"Python\": 12, \"JavaScript\": 14, \"Java\": 3, \"Ruby\": 3, \"CSS\": 3, \"C++\": 6, \"CoffeeScript\": 1, \"Scala\": 2, \"HTML\": 5, \"Vue\": 1, \"Clojure\": 1, \"PHP\": 3, \"TypeScript\": 1, \"C\": 8, \"Assembly\": 2, \"Objective-C\": 1, \"C#\": 1}, \"repo_count\": 100, \"total_stars\": 14354, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:29:03.191201\"}"
},
"overlap_score": 170.3406027914858,
"overlap_reasons": "[\"shared values: unix, foss, federated_chat, home_automation, privacy\", \"both remote-friendly\", \"complementary skills: Kotlin, Clojure, Scala, Objective-C, Dart\"]"
},
"draft": "hi Quinn,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: using JavaScript, Python, Go | (100 repos) | interested in foss, home_automation, privacy\n\nsbilly is building: working on mesh-network | using Go, Shell, Dockerfile | (100 repos) | interested in foss, home_automation, privacy\n\noverlap: shared values: unix, foss, federated_chat, home_automation, privacy | both remote-friendly | complementary skills: Kotlin, Clojure, Scala, Objective-C, Dart\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/sbilly\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
"recipient": {
"id": 415,
"username": "sbilly",
"platform": "github",
"name": "sbilly",
"url": "https://github.com/sbilly",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"http://sbilly.com/\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"decentralized\", \"community\", \"modern_lang\", \"containers\", \"remote\"]",
"extra": "{\"topics\": [\"mesh-network\"], \"languages\": {\"Go\": 4, \"Shell\": 4, \"Dockerfile\": 1, \"Python\": 12, \"JavaScript\": 14, \"Java\": 3, \"Ruby\": 3, \"CSS\": 3, \"C++\": 6, \"CoffeeScript\": 1, \"Scala\": 2, \"HTML\": 5, \"Vue\": 1, \"Clojure\": 1, \"PHP\": 3, \"TypeScript\": 1, \"C\": 8, \"Assembly\": 2, \"Objective-C\": 1, \"C#\": 1}, \"repo_count\": 100, \"total_stars\": 14354, \"extra\": {\"topics\": [\"mesh-network\"], \"languages\": {\"Go\": 4, \"Shell\": 4, \"Dockerfile\": 1, \"Python\": 12, \"JavaScript\": 14, \"Java\": 3, \"Ruby\": 3, \"CSS\": 3, \"C++\": 6, \"CoffeeScript\": 1, \"Scala\": 2, \"HTML\": 5, \"Vue\": 1, \"Clojure\": 1, \"PHP\": 3, \"TypeScript\": 1, \"C\": 8, \"Assembly\": 2, \"Objective-C\": 1, \"C#\": 1}, \"repo_count\": 100, \"total_stars\": 14354, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:29:03.191201\"}"
},
"queued_at": "2025-12-15T23:14:48.455001",
"status": "pending"
},
{
"match": {
"id": 10793,
"human_a": {
"id": 526,
"username": "2234839",
"platform": "github",
"name": "\u5d2e\u751f",
"url": "https://github.com/2234839",
"contact": "{\"email\": \"admin@shenzilong.cn\", \"emails\": [\"admin@shenzilong.cn\"], \"blog\": \"https://shenzilong.cn\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"TypeScript\": 54, \"Vue\": 7, \"JavaScript\": 12, \"Rust\": 1, \"CSS\": 3, \"Go\": 1, \"Ruby\": 1, \"HTML\": 1, \"Svelte\": 2}, \"repo_count\": 100, \"total_stars\": 528, \"extra\": {\"topics\": [], \"languages\": {\"TypeScript\": 54, \"Vue\": 7, \"JavaScript\": 12, \"Rust\": 1, \"CSS\": 3, \"Go\": 1, \"Ruby\": 1, \"HTML\": 1, \"Svelte\": 2}, \"repo_count\": 100, \"total_stars\": 528, \"hireable\": null, \"handles\": {\"github\": \"2234839\"}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:37:19.731768\"}"
},
"human_b": {
"id": 212,
"username": "uhthomas",
"platform": "github",
"name": "Thomas",
"url": "https://github.com/uhthomas",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"6f.io\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"CUE\": 1, \"Dockerfile\": 4, \"Go\": 27, \"Starlark\": 10, \"Rust\": 2, \"Lua\": 1, \"JavaScript\": 3, \"Dart\": 1, \"Python\": 1, \"TypeScript\": 1}, \"repo_count\": 100, \"total_stars\": 138, \"extra\": {\"topics\": [], \"languages\": {\"CUE\": 1, \"Dockerfile\": 4, \"Go\": 27, \"Starlark\": 10, \"Rust\": 2, \"Lua\": 1, \"JavaScript\": 3, \"Dart\": 1, \"Python\": 1, \"TypeScript\": 1}, \"repo_count\": 100, \"total_stars\": 138, \"hireable\": true, \"handles\": {\"github\": \"uhthomas\"}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:16:14.638950\"}"
},
"overlap_score": 152.39313358485379,
"overlap_reasons": "[\"shared values: unix, community, foss, selfhosted, modern_lang\", \"both remote-friendly\", \"complementary skills: Python, HTML, Ruby, CSS, CUE\"]"
},
"draft": "hi \u5d2e\u751f,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: using TypeScript, Vue, JavaScript | (100 repos) | interested in foss, privacy, selfhosted\n\nThomas is building: using CUE, Dockerfile, Go | (100 repos) | interested in foss, selfhosted\n\noverlap: shared values: unix, community, foss, selfhosted, modern_lang | both remote-friendly | complementary skills: Python, HTML, Ruby, CSS, CUE\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/uhthomas\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
"recipient": {
"id": 212,
"username": "uhthomas",
"platform": "github",
"name": "Thomas",
"url": "https://github.com/uhthomas",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"6f.io\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"CUE\": 1, \"Dockerfile\": 4, \"Go\": 27, \"Starlark\": 10, \"Rust\": 2, \"Lua\": 1, \"JavaScript\": 3, \"Dart\": 1, \"Python\": 1, \"TypeScript\": 1}, \"repo_count\": 100, \"total_stars\": 138, \"extra\": {\"topics\": [], \"languages\": {\"CUE\": 1, \"Dockerfile\": 4, \"Go\": 27, \"Starlark\": 10, \"Rust\": 2, \"Lua\": 1, \"JavaScript\": 3, \"Dart\": 1, \"Python\": 1, \"TypeScript\": 1}, \"repo_count\": 100, \"total_stars\": 138, \"hireable\": true, \"handles\": {\"github\": \"uhthomas\"}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:16:14.638950\"}"
},
"queued_at": "2025-12-16T00:33:56.913113",
"status": "pending"
},
{
"match": {
"id": 3924,
"human_a": {
"id": 777,
"username": "joshuaboniface",
"platform": "github",
"name": "Joshua M. Boniface",
"url": "https://github.com/joshuaboniface",
"contact": "{\"email\": \"joshua@boniface.me\", \"emails\": [\"joshua@boniface.me\"], \"blog\": \"https://www.boniface.me\", \"twitter\": null, \"mastodon\": \"@joshuaboniface@www.youtube.com\", \"bluesky\": null, \"matrix\": null, \"lemmy\": \"@djbon2112@old.reddit.com\"}",
"signals": "[\"unix\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"decentralized\", \"selfhosted\", \"modern_lang\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"Python\": 17, \"C#\": 13, \"JavaScript\": 5, \"SCSS\": 1, \"Go\": 1, \"HTML\": 2, \"Shell\": 4, \"C++\": 2, \"Java\": 3}, \"repo_count\": 96, \"total_stars\": 1157, \"extra\": {\"topics\": [], \"languages\": {\"Python\": 17, \"C#\": 13, \"JavaScript\": 5, \"SCSS\": 1, \"Go\": 1, \"HTML\": 2, \"Shell\": 4, \"C++\": 2, \"Java\": 3}, \"repo_count\": 96, \"total_stars\": 1157, \"hireable\": null, \"handles\": {\"github\": \"joshuaboniface\", \"linkedin\": \"joshuamboniface\", \"mastodon\": \"@joshuaboniface@www.youtube.com\", \"lemmy\": \"@djbon2112@old.reddit.com\"}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:52:45.963017\"}"
},
"human_b": {
"id": 228,
"username": "mintsoft",
"platform": "github",
"name": "Rob Emery",
"url": "https://github.com/mintsoft",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"privacy\", \"decentralized\", \"selfhosted\", \"modern_lang\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"Kotlin\": 4, \"Go\": 5, \"Python\": 7, \"C\": 2, \"Shell\": 3, \"Dart\": 1, \"Java\": 2, \"C#\": 1, \"PHP\": 2, \"C++\": 7, \"JavaScript\": 5, \"Perl\": 2, \"Makefile\": 1, \"HTML\": 1, \"PowerShell\": 1}, \"repo_count\": 100, \"total_stars\": 33, \"extra\": {\"topics\": [], \"languages\": {\"Kotlin\": 4, \"Go\": 5, \"Python\": 7, \"C\": 2, \"Shell\": 3, \"Dart\": 1, \"Java\": 2, \"C#\": 1, \"PHP\": 2, \"C++\": 7, \"JavaScript\": 5, \"Perl\": 2, \"Makefile\": 1, \"HTML\": 1, \"PowerShell\": 1}, \"repo_count\": 100, \"total_stars\": 33, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:17:08.966748\"}"
},
"overlap_score": 149.1843240344525,
"overlap_reasons": "[\"shared values: unix, foss, privacy, decentralized, selfhosted\", \"both remote-friendly\", \"complementary skills: Kotlin, Makefile, PHP, Dart, SCSS\"]"
},
"draft": "hi Joshua,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: using Python, C#, JavaScript | (96 repos) | interested in foss, home_automation, privacy\n\nRob is building: using Kotlin, Go, Python | (100 repos) | interested in foss, privacy, selfhosted\n\noverlap: shared values: unix, foss, privacy, decentralized, selfhosted | both remote-friendly | complementary skills: Kotlin, Makefile, PHP, Dart, SCSS\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/mintsoft\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
"recipient": {
"id": 228,
"username": "mintsoft",
"platform": "github",
"name": "Rob Emery",
"url": "https://github.com/mintsoft",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"privacy\", \"decentralized\", \"selfhosted\", \"modern_lang\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"Kotlin\": 4, \"Go\": 5, \"Python\": 7, \"C\": 2, \"Shell\": 3, \"Dart\": 1, \"Java\": 2, \"C#\": 1, \"PHP\": 2, \"C++\": 7, \"JavaScript\": 5, \"Perl\": 2, \"Makefile\": 1, \"HTML\": 1, \"PowerShell\": 1}, \"repo_count\": 100, \"total_stars\": 33, \"extra\": {\"topics\": [], \"languages\": {\"Kotlin\": 4, \"Go\": 5, \"Python\": 7, \"C\": 2, \"Shell\": 3, \"Dart\": 1, \"Java\": 2, \"C#\": 1, \"PHP\": 2, \"C++\": 7, \"JavaScript\": 5, \"Perl\": 2, \"Makefile\": 1, \"HTML\": 1, \"PowerShell\": 1}, \"repo_count\": 100, \"total_stars\": 33, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:17:08.966748\"}"
},
"queued_at": "2025-12-16T00:33:56.920505",
"status": "pending"
},
{
"match": {
"id": 13072,
"human_a": {
"id": 212,
"username": "uhthomas",
"platform": "github",
"name": "Thomas",
"url": "https://github.com/uhthomas",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"6f.io\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"CUE\": 1, \"Dockerfile\": 4, \"Go\": 27, \"Starlark\": 10, \"Rust\": 2, \"Lua\": 1, \"JavaScript\": 3, \"Dart\": 1, \"Python\": 1, \"TypeScript\": 1}, \"repo_count\": 100, \"total_stars\": 138, \"extra\": {\"topics\": [], \"languages\": {\"CUE\": 1, \"Dockerfile\": 4, \"Go\": 27, \"Starlark\": 10, \"Rust\": 2, \"Lua\": 1, \"JavaScript\": 3, \"Dart\": 1, \"Python\": 1, \"TypeScript\": 1}, \"repo_count\": 100, \"total_stars\": 138, \"hireable\": true, \"handles\": {\"github\": \"uhthomas\"}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:16:14.638950\"}"
},
"human_b": {
"id": 96,
"username": "SlyBouhafs",
"platform": "github",
"name": "Sly",
"url": "https://github.com/SlyBouhafs",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"selfhosted\", \"modern_lang\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"JavaScript\": 9, \"Makefile\": 1, \"Python\": 2, \"HTML\": 2, \"TypeScript\": 3, \"Lua\": 1, \"Vim script\": 1}, \"repo_count\": 29, \"total_stars\": 23, \"extra\": {\"topics\": [], \"languages\": {\"JavaScript\": 9, \"Makefile\": 1, \"Python\": 2, \"HTML\": 2, \"TypeScript\": 3, \"Lua\": 1, \"Vim script\": 1}, \"repo_count\": 29, \"total_stars\": 23, \"hireable\": true, \"handles\": {}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:09:12.423838\"}"
},
"overlap_score": 142.37165974941587,
"overlap_reasons": "[\"shared values: unix, community, foss, selfhosted, modern_lang\", \"both remote-friendly\", \"complementary skills: Go, HTML, CUE, Dart, Makefile\"]"
},
"draft": "hi Thomas,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: using CUE, Dockerfile, Go | (100 repos) | interested in foss, selfhosted\n\nSly is building: using JavaScript, Makefile, Python | (29 repos) | interested in foss, selfhosted\n\noverlap: shared values: unix, community, foss, selfhosted, modern_lang | both remote-friendly | complementary skills: Go, HTML, CUE, Dart, Makefile\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/SlyBouhafs\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
"recipient": {
"id": 96,
"username": "SlyBouhafs",
"platform": "github",
"name": "Sly",
"url": "https://github.com/SlyBouhafs",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"selfhosted\", \"modern_lang\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"JavaScript\": 9, \"Makefile\": 1, \"Python\": 2, \"HTML\": 2, \"TypeScript\": 3, \"Lua\": 1, \"Vim script\": 1}, \"repo_count\": 29, \"total_stars\": 23, \"extra\": {\"topics\": [], \"languages\": {\"JavaScript\": 9, \"Makefile\": 1, \"Python\": 2, \"HTML\": 2, \"TypeScript\": 3, \"Lua\": 1, \"Vim script\": 1}, \"repo_count\": 29, \"total_stars\": 23, \"hireable\": true, \"handles\": {}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:09:12.423838\"}"
},
"queued_at": "2025-12-16T00:33:56.930693",
"status": "pending"
},
{
"match": {
"id": 12980,
"human_a": {
"id": 775,
"username": "CarlSchwan",
"platform": "github",
"name": "Carl Schwan",
"url": "https://github.com/CarlSchwan",
"contact": "{\"email\": \"carlschwan@kde.org\", \"emails\": [\"carlschwan@kde.org\", \"carl@carlschwan.eu\"], \"blog\": \"https://carlschwan.eu\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\"]",
"extra": "{\"topics\": [], \"languages\": {\"C++\": 3, \"Shell\": 3, \"Lua\": 2, \"PHP\": 3, \"QML\": 1, \"CSS\": 1}, \"repo_count\": 100, \"total_stars\": 20, \"extra\": {\"topics\": [], \"languages\": {\"C++\": 3, \"Shell\": 3, \"Lua\": 2, \"PHP\": 3, \"QML\": 1, \"CSS\": 1}, \"repo_count\": 100, \"total_stars\": 20, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:52:38.446226\"}"
},
"human_b": {
"id": 665,
"username": "TCOTC",
"platform": "github",
"name": "Jeffrey Chen",
"url": "https://github.com/TCOTC",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\"]",
"extra": "{\"topics\": [], \"languages\": {\"TypeScript\": 14, \"JavaScript\": 11, \"SCSS\": 2, \"Go\": 8, \"Kotlin\": 13, \"HTML\": 3, \"Python\": 10, \"C++\": 7, \"Java\": 5, \"PHP\": 2, \"Rust\": 2, \"Vue\": 4, \"C#\": 4, \"Shell\": 2, \"Swift\": 1, \"Ruby\": 1, \"Dart\": 1, \"Svelte\": 1}, \"repo_count\": 100, \"total_stars\": 19, \"extra\": {\"topics\": [], \"languages\": {\"TypeScript\": 14, \"JavaScript\": 11, \"SCSS\": 2, \"Go\": 8, \"Kotlin\": 13, \"HTML\": 3, \"Python\": 10, \"C++\": 7, \"Java\": 5, \"PHP\": 2, \"Rust\": 2, \"Vue\": 4, \"C#\": 4, \"Shell\": 2, \"Swift\": 1, \"Ruby\": 1, \"Dart\": 1, \"Svelte\": 1}, \"repo_count\": 100, \"total_stars\": 19, \"hireable\": null, \"handles\": {\"github\": \"siyuan-note\"}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:45:24.008492\"}"
},
"overlap_score": 135.0,
"overlap_reasons": "[\"shared values: unix, foss, federated_chat, privacy, community\", \"complementary skills: Python, Kotlin, Ruby, JavaScript, QML\"]"
},
"draft": "hi Carl,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: using C++, Shell, Lua | (100 repos) | interested in foss, privacy, selfhosted\n\nJeffrey is building: using TypeScript, JavaScript, SCSS | (100 repos) | interested in foss, privacy, selfhosted\n\noverlap: shared values: unix, foss, federated_chat, privacy, community | complementary skills: Python, Kotlin, Ruby, JavaScript, QML\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/TCOTC\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
"recipient": {
"id": 665,
"username": "TCOTC",
"platform": "github",
"name": "Jeffrey Chen",
"url": "https://github.com/TCOTC",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\"]",
"extra": "{\"topics\": [], \"languages\": {\"TypeScript\": 14, \"JavaScript\": 11, \"SCSS\": 2, \"Go\": 8, \"Kotlin\": 13, \"HTML\": 3, \"Python\": 10, \"C++\": 7, \"Java\": 5, \"PHP\": 2, \"Rust\": 2, \"Vue\": 4, \"C#\": 4, \"Shell\": 2, \"Swift\": 1, \"Ruby\": 1, \"Dart\": 1, \"Svelte\": 1}, \"repo_count\": 100, \"total_stars\": 19, \"extra\": {\"topics\": [], \"languages\": {\"TypeScript\": 14, \"JavaScript\": 11, \"SCSS\": 2, \"Go\": 8, \"Kotlin\": 13, \"HTML\": 3, \"Python\": 10, \"C++\": 7, \"Java\": 5, \"PHP\": 2, \"Rust\": 2, \"Vue\": 4, \"C#\": 4, \"Shell\": 2, \"Swift\": 1, \"Ruby\": 1, \"Dart\": 1, \"Svelte\": 1}, \"repo_count\": 100, \"total_stars\": 19, \"hireable\": null, \"handles\": {\"github\": \"siyuan-note\"}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:45:24.008492\"}"
},
"queued_at": "2025-12-16T00:59:33.606115",
"status": "pending"
},
{
"match": {
"id": 12457,
"human_a": {
"id": 171,
"username": "louislam",
"platform": "github",
"name": "Louis Lam",
"url": "https://github.com/louislam",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": \"louislam\", \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"foss\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
"extra": "{\"topics\": [\"self-hosted\"], \"languages\": {\"JavaScript\": 10, \"PHP\": 6, \"TypeScript\": 11, \"HTML\": 1, \"Vue\": 1, \"Shell\": 3, \"Java\": 2, \"C#\": 2, \"Hack\": 1, \"Kotlin\": 1, \"Dockerfile\": 2, \"PLpgSQL\": 1, \"CSS\": 2, \"Smarty\": 1, \"Visual Basic\": 1}, \"repo_count\": 56, \"total_stars\": 101905, \"extra\": {\"topics\": [\"self-hosted\"], \"languages\": {\"JavaScript\": 10, \"PHP\": 6, \"TypeScript\": 11, \"HTML\": 1, \"Vue\": 1, \"Shell\": 3, \"Java\": 2, \"C#\": 2, \"Hack\": 1, \"Kotlin\": 1, \"Dockerfile\": 2, \"PLpgSQL\": 1, \"CSS\": 2, \"Smarty\": 1, \"Visual Basic\": 1}, \"repo_count\": 56, \"total_stars\": 101905, \"hireable\": true, \"handles\": {\"twitter\": \"@louislam\", \"github\": \"louislam\"}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:13:56.492647\"}"
},
"human_b": {
"id": 364,
"username": "anokfireball",
"platform": "github",
"name": "Fabian Koller",
"url": "https://github.com/anokfireball",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"foss\", \"selfhosted\", \"modern_lang\", \"containers\", \"p2p\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"Jinja\": 1, \"HCL\": 1, \"Shell\": 1, \"Python\": 4, \"Dockerfile\": 1, \"C\": 2, \"Lua\": 1, \"Go\": 1, \"C++\": 2}, \"repo_count\": 22, \"total_stars\": 13, \"extra\": {\"topics\": [], \"languages\": {\"Jinja\": 1, \"HCL\": 1, \"Shell\": 1, \"Python\": 4, \"Dockerfile\": 1, \"C\": 2, \"Lua\": 1, \"Go\": 1, \"C++\": 2}, \"repo_count\": 22, \"total_stars\": 13, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:25:55.690643\"}"
},
"overlap_score": 129.61885790097358,
"overlap_reasons": "[\"shared values: foss, selfhosted, modern_lang, containers, remote\", \"both remote-friendly\", \"complementary skills: Python, Kotlin, HCL, Jinja, C++\"]"
},
"draft": "hi Louis,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: working on self-hosted | using JavaScript, PHP, TypeScript | (56 repos) | interested in foss, selfhosted\n\nFabian is building: using Jinja, HCL, Shell | (22 repos) | interested in foss, selfhosted\n\noverlap: shared values: foss, selfhosted, modern_lang, containers, remote | both remote-friendly | complementary skills: Python, Kotlin, HCL, Jinja, C++\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/anokfireball\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
"recipient": {
"id": 364,
"username": "anokfireball",
"platform": "github",
"name": "Fabian Koller",
"url": "https://github.com/anokfireball",
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
"signals": "[\"foss\", \"selfhosted\", \"modern_lang\", \"containers\", \"p2p\", \"remote\"]",
"extra": "{\"topics\": [], \"languages\": {\"Jinja\": 1, \"HCL\": 1, \"Shell\": 1, \"Python\": 4, \"Dockerfile\": 1, \"C\": 2, \"Lua\": 1, \"Go\": 1, \"C++\": 2}, \"repo_count\": 22, \"total_stars\": 13, \"extra\": {\"topics\": [], \"languages\": {\"Jinja\": 1, \"HCL\": 1, \"Shell\": 1, \"Python\": 4, \"Dockerfile\": 1, \"C\": 2, \"Lua\": 1, \"Go\": 1, \"C++\": 2}, \"repo_count\": 22, \"total_stars\": 13, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:25:55.690643\"}"
},
"queued_at": "2025-12-16T01:12:40.906296",
"status": "pending"
}
]


@ -0,0 +1,82 @@
{
"users": {
"testuser": [
"home-assistant",
"esphome"
],
"sudoxnym": [],
"joyeusenoelle": [],
"sbilly": [
"awesome-security"
],
"turt2live": [
"matrix-org",
"element-hq",
"ENTS-Source",
"IETF-Hackathon",
"t2bot"
],
"balloob": [
"home-assistant",
"hassio-addons",
"NabuCasa",
"esphome",
"OpenHomeFoundation"
],
"anikdhabal": [],
"fabaff": [
"NixOS",
"home-assistant",
"affolter-engineering",
"esphome",
"home-assistant-ecosystem"
],
"uhthomas": [
"wiz-sec"
],
"emontnemery": [],
"Stradex": [],
"Tribler": [],
"bdraco": [
"CpanelInc",
"aio-libs",
"home-assistant",
"esphome",
"python-kasa",
"home-assistant-libs",
"Bluetooth-Devices",
"python-zeroconf",
"pyenphase",
"ESPHome-RATGDO",
"ratgdo",
"OpenHomeFoundation",
"uilibs",
"sblibs",
"openvideolibs",
"Harmony-Libs",
"lightinglibs",
"kohlerlibs",
"open-home-foundation-maintainers",
"Yale-Libs",
"Solarlibs",
"esphome-libs"
],
"ArchiveBox": []
},
"updated": {
"testuser": "2025-12-14T22:44:28.772479",
"sudoxnym": "2025-12-14T22:51:13.523581",
"joyeusenoelle": "2025-12-14T23:19:46.135417",
"sbilly": "2025-12-14T23:19:55.813111",
"turt2live": "2025-12-14T23:20:04.266843",
"balloob": "2025-12-14T23:20:20.527129",
"anikdhabal": "2025-12-14T23:20:32.904717",
"fabaff": "2025-12-14T23:20:39.889442",
"uhthomas": "2025-12-14T23:20:59.048667",
"emontnemery": "2025-12-14T23:21:06.590806",
"Stradex": "2025-12-14T23:21:14.490327",
"Tribler": "2025-12-14T23:21:24.234634",
"bdraco": "2025-12-14T23:26:12.662456",
"ArchiveBox": "2025-12-14T23:26:32.513637"
}
}


@ -0,0 +1,137 @@
{
"sent": [
{
"recipient_id": "github:dwmw2",
"recipient_name": "David Woodhouse",
"method": "email",
"contact_info": "dwmw2@infradead.org",
"overlap_score": 172.01631023799695,
"timestamp": "2025-12-15T23:14:45.542509",
"success": true,
"error": null
},
{
"recipient_id": "github:pvizeli",
"recipient_name": "Pascal Vizeli",
"method": "email",
"contact_info": "pascal.vizeli@syshack.ch",
"overlap_score": 163.33333333333331,
"timestamp": "2025-12-15T23:14:48.462716",
"success": true,
"error": null
},
{
"recipient_id": "github:2234839",
"recipient_name": "\u5d2e\u751f",
"method": "email",
"contact_info": "admin@shenzilong.cn",
"overlap_score": 163.09442000261095,
"timestamp": "2025-12-15T23:14:50.749442",
"success": true,
"error": null
},
{
"recipient_id": "github:zomars",
"recipient_name": "Omar L\u00f3pez",
"method": "email",
"contact_info": "zomars@me.com",
"overlap_score": 138.9593178751708,
"timestamp": "2025-12-16T00:39:43.266181",
"success": true,
"error": null
},
{
"recipient_id": "github:joshuaboniface",
"recipient_name": "Joshua M. Boniface",
"method": "mastodon",
"contact_info": "@joshuaboniface@www.youtube.com",
"overlap_score": 136.06304901929022,
"timestamp": "2025-12-16T00:59:21.763092",
"success": true,
"error": "https://mastodon.sudoxreboot.com/@connectd/115726533401043321"
},
{
"recipient_id": "github:dariusk",
"recipient_name": "Darius Kazemi",
"method": "mastodon",
"contact_info": "@darius@friend.camp",
"overlap_score": 135.39490109778416,
"timestamp": "2025-12-16T00:59:22.199945",
"success": true,
"error": "https://mastodon.sudoxreboot.com/@connectd/115726533505124538"
}
],
"failed": [
{
"recipient_id": "github:joyeusenoelle",
"recipient_name": "No\u00eblle Anthony",
"method": "mastodon",
"contact_info": "@noelle@chat.noelle.codes",
"overlap_score": 65,
"timestamp": "2025-12-14T23:44:17.215796",
"success": false,
"error": "MASTODON_TOKEN not set"
},
{
"recipient_id": "github:balloob",
"recipient_name": "Paulus Schoutsen",
"method": "mastodon",
"contact_info": "@home_assistant@youtube.com",
"overlap_score": 163.09442000261095,
"timestamp": "2025-12-15T23:14:50.155178",
"success": false,
"error": "mastodon api error: 401 - {\"error\":\"The access token is invalid\"}"
},
{
"recipient_id": "github:balloob",
"recipient_name": "Paulus Schoutsen",
"method": "mastodon",
"contact_info": "@home_assistant@youtube.com",
"overlap_score": 163.09442000261095,
"timestamp": "2025-12-15T23:14:50.334902",
"success": false,
"error": "mastodon api error: 401 - {\"error\":\"The access token is invalid\"}"
},
{
"recipient_id": "github:joshuaboniface",
"recipient_name": "Joshua M. Boniface",
"method": "mastodon",
"contact_info": "@joshuaboniface@www.youtube.com",
"overlap_score": 136.06304901929022,
"timestamp": "2025-12-16T00:53:25.848601",
"success": false,
"error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794e05e490>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
},
{
"recipient_id": "github:joshuaboniface",
"recipient_name": "Joshua M. Boniface",
"method": "mastodon",
"contact_info": "@joshuaboniface@www.youtube.com",
"overlap_score": 136.06304901929022,
"timestamp": "2025-12-16T00:53:55.912872",
"success": false,
"error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794e07b1d0>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
},
{
"recipient_id": "github:dariusk",
"recipient_name": "Darius Kazemi",
"method": "mastodon",
"contact_info": "@darius@friend.camp",
"overlap_score": 135.39490109778416,
"timestamp": "2025-12-16T00:54:25.947404",
"success": false,
"error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794e0986d0>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
},
{
"recipient_id": "github:dariusk",
"recipient_name": "Darius Kazemi",
"method": "mastodon",
"contact_info": "@darius@friend.camp",
"overlap_score": 135.39490109778416,
"timestamp": "2025-12-16T00:54:55.982839",
"success": false,
"error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794de9dd90>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
}
],
"queued": []
}

central_client.py

@ -0,0 +1,204 @@
"""
connectd/central_client.py - client for connectd-central API
provides a similar interface to the local Database class, but backed by the remote API.
allows distributed instances to share data and coordinate outreach.
"""
import os
import json
import requests
from typing import Optional, List, Dict, Any, Tuple
from datetime import datetime
CENTRAL_API = os.environ.get('CONNECTD_CENTRAL_API', '')
API_KEY = os.environ.get('CONNECTD_API_KEY', '')
INSTANCE_ID = os.environ.get('CONNECTD_INSTANCE_ID', 'default')
class CentralClient:
"""client for connectd-central API"""
def __init__(self, api_url: str = None, api_key: str = None, instance_id: str = None):
self.api_url = api_url or CENTRAL_API
self.api_key = api_key or API_KEY
self.instance_id = instance_id or INSTANCE_ID
self.headers = {
'X-API-Key': self.api_key,
'Content-Type': 'application/json'
}
if not self.api_key:
raise ValueError('CONNECTD_API_KEY environment variable required')
def _get(self, endpoint: str, params: dict = None) -> dict:
resp = requests.get(f'{self.api_url}{endpoint}', headers=self.headers, params=params, timeout=30)
resp.raise_for_status()
return resp.json()
def _post(self, endpoint: str, data: dict) -> dict:
resp = requests.post(f'{self.api_url}{endpoint}', headers=self.headers, json=data, timeout=30)
resp.raise_for_status()
return resp.json()
# === HUMANS ===
def get_human(self, human_id: int) -> Optional[dict]:
try:
return self._get(f'/humans/{human_id}')
except Exception:
return None
def get_humans(self, platform: str = None, user_type: str = None,
min_score: float = 0, limit: int = 100, offset: int = 0) -> List[dict]:
params = {'min_score': min_score, 'limit': limit, 'offset': offset}
if platform:
params['platform'] = platform
if user_type:
params['user_type'] = user_type
result = self._get('/humans', params)
return result.get('humans', [])
def get_all_humans(self, min_score: float = 0, limit: int = 100000) -> List[dict]:
"""get all humans (for matching)"""
return self.get_humans(min_score=min_score, limit=limit)
def get_lost_builders(self, min_score: float = 30, limit: int = 100) -> List[dict]:
"""get lost builders for outreach"""
return self.get_humans(user_type='lost', min_score=min_score, limit=limit)
def get_builders(self, min_score: float = 50, limit: int = 100) -> List[dict]:
"""get active builders"""
return self.get_humans(user_type='builder', min_score=min_score, limit=limit)
def upsert_human(self, human: dict) -> int:
"""create or update human, returns id"""
result = self._post('/humans', human)
return result.get('id')
def upsert_humans_bulk(self, humans: List[dict]) -> Tuple[int, int]:
"""bulk upsert humans, returns (created, updated)"""
result = self._post('/humans/bulk', humans)
return result.get('created', 0), result.get('updated', 0)
# === MATCHES ===
def get_matches(self, min_score: float = 0, limit: int = 100, offset: int = 0) -> List[dict]:
params = {'min_score': min_score, 'limit': limit, 'offset': offset}
result = self._get('/matches', params)
return result.get('matches', [])
def create_match(self, human_a_id: int, human_b_id: int,
overlap_score: float, overlap_reasons: str = None) -> int:
"""create match, returns id"""
result = self._post('/matches', {
'human_a_id': human_a_id,
'human_b_id': human_b_id,
'overlap_score': overlap_score,
'overlap_reasons': overlap_reasons
})
return result.get('id')
def create_matches_bulk(self, matches: List[dict]) -> int:
"""bulk create matches, returns count"""
result = self._post('/matches/bulk', matches)
return result.get('created', 0)
# === OUTREACH COORDINATION ===
def get_pending_outreach(self, outreach_type: str = None, limit: int = 50) -> List[dict]:
"""get pending outreach that hasn't been claimed"""
params = {'limit': limit}
if outreach_type:
params['outreach_type'] = outreach_type
result = self._get('/outreach/pending', params)
return result.get('pending', [])
def claim_outreach(self, human_id: int, match_id: int = None,
outreach_type: str = 'intro') -> Optional[int]:
"""claim outreach for a human, returns outreach_id or None if already claimed"""
try:
result = self._post('/outreach/claim', {
'human_id': human_id,
'match_id': match_id,
'outreach_type': outreach_type
})
return result.get('outreach_id')
except requests.exceptions.HTTPError as e:
if e.response.status_code == 409:
return None # already claimed by another instance
raise
def complete_outreach(self, outreach_id: int, status: str,
sent_via: str = None, draft: str = None, error: str = None):
"""mark outreach as complete"""
self._post('/outreach/complete', {
'outreach_id': outreach_id,
'status': status,
'sent_via': sent_via,
'draft': draft,
'error': error
})
def get_outreach_history(self, status: str = None, limit: int = 100) -> List[dict]:
params = {'limit': limit}
if status:
params['status'] = status
result = self._get('/outreach/history', params)
return result.get('history', [])
def already_contacted(self, human_id: int) -> bool:
"""check if human has been contacted"""
history = self._get('/outreach/history', {'limit': 10000})
sent = history.get('history', [])
return any(h['human_id'] == human_id and h['status'] == 'sent' for h in sent)
# === STATS ===
def get_stats(self) -> dict:
return self._get('/stats')
# === INSTANCE MANAGEMENT ===
def register_instance(self, name: str, host: str):
"""register this instance with central"""
self._post(f'/instances/register?name={name}&host={host}', {})
def get_instances(self) -> List[dict]:
result = self._get('/instances')
return result.get('instances', [])
# === HEALTH ===
def health_check(self) -> bool:
try:
result = self._get('/health')
return result.get('status') == 'ok'
except Exception:
return False
# === TOKENS ===
def get_token(self, user_id: int, match_id: int = None) -> str:
"""get or create a token for a user"""
params = {}
if match_id:
params['match_id'] = match_id
result = self._get(f'/api/token/{user_id}', params)
return result.get('token')
def get_interested_count(self, user_id: int) -> int:
"""get count of people interested in this user"""
try:
result = self._get(f'/api/interested_count/{user_id}')
return result.get('count', 0)
except Exception:
return 0
# convenience function
def get_client() -> CentralClient:
return CentralClient()
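The claim/complete cycle is how multiple instances avoid double-contacting the same person: `claim_outreach` returns `None` when the central API answers 409, and the caller records the outcome with `complete_outreach`. A minimal sketch of that loop, using a hypothetical in-memory `FakeCentral` stand-in (illustration only — it mirrors just the claim/complete semantics, not the HTTP client):

```python
class FakeCentral:
    """in-memory stand-in for CentralClient, for illustration only."""
    def __init__(self):
        self._claims = {}      # human_id -> outreach_id
        self._completed = {}   # outreach_id -> status
        self._next_id = 1

    def claim_outreach(self, human_id, match_id=None, outreach_type='intro'):
        if human_id in self._claims:
            return None  # already claimed elsewhere (the API's 409 case)
        outreach_id = self._next_id
        self._next_id += 1
        self._claims[human_id] = outreach_id
        return outreach_id

    def complete_outreach(self, outreach_id, status, sent_via=None,
                          draft=None, error=None):
        self._completed[outreach_id] = status


def send_intro(central, human_id, deliver):
    """claim, deliver, then record the outcome; skip if already claimed."""
    outreach_id = central.claim_outreach(human_id)
    if outreach_id is None:
        return 'skipped'
    try:
        deliver()
        central.complete_outreach(outreach_id, 'sent', sent_via='email')
        return 'sent'
    except Exception as e:
        central.complete_outreach(outreach_id, 'failed', error=str(e))
        return 'failed'
```

If two instances race on the same human, the second sees `None` from the claim and skips — the claim itself is the deduplication, so delivery never has to be idempotent.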


@ -21,8 +21,8 @@ CACHE_DIR.mkdir(exist_ok=True)
# === DAEMON CONFIG ===
SCOUT_INTERVAL = 3600 * 4 # full scout every 4 hours
MATCH_INTERVAL = 3600 # check matches every hour
INTRO_INTERVAL = 3600 * 2 # send intros every 2 hours
MAX_INTROS_PER_DAY = 20 # rate limit builder-to-builder outreach
INTRO_INTERVAL = 1800 # send intros every 30 minutes
MAX_INTROS_PER_DAY = 1000 # rate limit builder-to-builder outreach
# === MATCHING CONFIG ===
@ -42,7 +42,7 @@ LOST_CONFIG = {
# outreach settings
'enabled': True,
'max_per_day': 5, # lower volume, higher care
'max_per_day': 100, # daily cap for lost-builder outreach
'require_review': False, # fully autonomous
'cooldown_days': 90, # don't spam struggling people
@ -67,9 +67,50 @@ LOST_CONFIG = {
GROQ_API_KEY = os.environ.get('GROQ_API_KEY', '')
GROQ_API_URL = 'https://api.groq.com/openai/v1/chat/completions'
GROQ_MODEL = os.environ.get('GROQ_MODEL', 'llama-3.1-70b-versatile')
GROQ_MODEL = os.environ.get('GROQ_MODEL', 'llama-3.3-70b-versatile')
GITHUB_TOKEN = os.environ.get('GITHUB_TOKEN', '')
# === FORGE TOKENS ===
# for creating issues on self-hosted git forges
# each forge needs its own token from that instance
#
# CODEBERG: Settings -> Applications -> Generate Token (repo:write scope)
# GITEA/FORGEJO: Settings -> Applications -> Generate Token
# GITLAB: Settings -> Access Tokens -> Personal Access Token (api scope)
# SOURCEHUT: Settings -> Personal Access Tokens (uses email instead)
CODEBERG_TOKEN = os.environ.get('CODEBERG_TOKEN', '')
GITEA_TOKENS = {} # instance_url -> token, loaded from env
GITLAB_TOKENS = {} # instance_url -> token, loaded from env
# parse GITEA_TOKENS from env
# format: GITEA_TOKEN_192_168_1_8_3259=token -> http://192.168.1.8:3259
# format: GITEA_TOKEN_codeberg_org=token -> https://codeberg.org
def _parse_instance_url(env_key, prefix):
"""convert env key to instance URL"""
raw = env_key.replace(prefix, '')
parts = raw.split('_')
# check if last part is a port number
if parts[-1].isdigit() and len(parts[-1]) <= 5:
port = parts[-1]
host = '.'.join(parts[:-1])
# local IPs use http
if host.startswith('192.168.') or host.startswith('10.') or host == 'localhost':
return f'http://{host}:{port}'
return f'https://{host}:{port}'
else:
host = '.'.join(parts)
return f'https://{host}'
for key, value in os.environ.items():
if key.startswith('GITEA_TOKEN_'):
url = _parse_instance_url(key, 'GITEA_TOKEN_')
GITEA_TOKENS[url] = value
elif key.startswith('GITLAB_TOKEN_'):
url = _parse_instance_url(key, 'GITLAB_TOKEN_')
GITLAB_TOKENS[url] = value
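The env-key → URL convention above can be exercised standalone; this is a trimmed copy of `_parse_instance_url` with the sample hosts from the comments (the hosts themselves are illustrative):

```python
def parse_instance_url(env_key, prefix):
    """convert GITEA_TOKEN_<host_with_underscores> style keys to instance URLs."""
    raw = env_key.replace(prefix, '')
    parts = raw.split('_')
    # a short trailing all-digit segment is treated as a port
    if parts[-1].isdigit() and len(parts[-1]) <= 5:
        port, host = parts[-1], '.'.join(parts[:-1])
        if host.startswith(('192.168.', '10.')) or host == 'localhost':
            return f'http://{host}:{port}'  # local instances use plain http
        return f'https://{host}:{port}'
    return 'https://' + '.'.join(parts)
```

Note the convention cannot represent hostnames that actually contain underscores, and a bare IP without a port would have its last octet misread as a port — acceptable tradeoffs for a scheme that has to fit into env-var names.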
MASTODON_TOKEN = os.environ.get('MASTODON_TOKEN', '')
MASTODON_INSTANCE = os.environ.get('MASTODON_INSTANCE', '')

daemon.py

@ -1,25 +1,22 @@
#!/usr/bin/env python3
"""
connectd daemon - continuous discovery and matchmaking
two modes of operation:
1. priority matching: find matches FOR hosts who run connectd
2. altruistic matching: connect strangers to each other
runs continuously, respects rate limits, sends intros automatically
REWIRED TO USE CENTRAL DATABASE
"""
import time
import json
import signal
import os
import sys
from datetime import datetime, timedelta
from pathlib import Path
from db import Database
from db.users import (init_users_table, get_priority_users, save_priority_match,
get_priority_user_matches, discover_host_user)
get_priority_user_matches, discover_host_user, mark_match_viewed)
from scoutd import scrape_github, scrape_reddit, scrape_mastodon, scrape_lobsters, scrape_lemmy, scrape_discord
from scoutd.forges import scrape_all_forges
from config import HOST_USER
from scoutd.github import analyze_github_user, get_github_user
from scoutd.signals import analyze_text
@ -32,21 +29,41 @@ from introd.send import send_email
from introd.deliver import deliver_intro, determine_best_contact
from config import get_lost_config
from api import start_api_thread, update_daemon_state
from central_client import CentralClient, get_client
class DummyDb:
"""dummy db that does nothing - scrapers save here but we push to central"""
def save_human(self, human): pass
def save_match(self, *args, **kwargs): pass
def get_human(self, *args, **kwargs): return None
def close(self): pass
# daemon config
SCOUT_INTERVAL = 3600 * 4 # full scout every 4 hours
MATCH_INTERVAL = 3600 # check matches every hour
INTRO_INTERVAL = 3600 * 2 # send intros every 2 hours
LOST_INTERVAL = 3600 * 6 # lost builder outreach every 6 hours (lower volume)
MAX_INTROS_PER_DAY = 20 # rate limit outreach
MIN_OVERLAP_PRIORITY = 30 # min score for priority user matches
MIN_OVERLAP_STRANGERS = 50 # higher bar for stranger intros
LOST_INTERVAL = 3600 * 6 # lost builder outreach every 6 hours
from config import MAX_INTROS_PER_DAY
MIN_OVERLAP_PRIORITY = 30
MIN_OVERLAP_STRANGERS = 50
class ConnectDaemon:
def __init__(self, dry_run=False):
self.db = Database()
init_users_table(self.db.conn)
# local db only for priority_users (host-specific)
self.local_db = Database()
init_users_table(self.local_db.conn)
# CENTRAL for all humans/matches
self.central = get_client()
if not self.central:
raise RuntimeError("CENTRAL API REQUIRED - set CONNECTD_API_KEY and CONNECTD_CENTRAL_API")
self.log("connected to CENTRAL database")
self.running = True
self.dry_run = dry_run
self.started_at = datetime.now()
@ -58,16 +75,17 @@ class ConnectDaemon:
self.lost_intros_today = 0
self.today = datetime.now().date()
# handle shutdown gracefully
# register instance
instance_id = os.environ.get('CONNECTD_INSTANCE_ID', 'daemon')
self.central.register_instance(instance_id, os.environ.get('CONNECTD_INSTANCE_IP', 'unknown'))
signal.signal(signal.SIGINT, self._shutdown)
signal.signal(signal.SIGTERM, self._shutdown)
# auto-discover host user from env
if HOST_USER:
self.log(f"HOST_USER set: {HOST_USER}")
discover_host_user(self.db.conn, HOST_USER)
discover_host_user(self.local_db.conn, HOST_USER)
# update API state
self._update_api_state()
def _shutdown(self, signum, frame):
@ -76,10 +94,8 @@ class ConnectDaemon:
self._update_api_state()
def _update_api_state(self):
"""update API state for HA integration"""
now = datetime.now()
# calculate countdowns - if no cycle has run, use started_at
def secs_until(last, interval):
base = last if last else self.started_at
next_run = base + timedelta(seconds=interval)
@ -103,11 +119,9 @@ class ConnectDaemon:
})
def log(self, msg):
"""timestamped log"""
print(f"[{datetime.now().strftime('%Y-%m-%d %H:%M:%S')}] {msg}")
def reset_daily_limits(self):
"""reset daily intro count"""
if datetime.now().date() != self.today:
self.today = datetime.now().date()
self.intros_today = 0
@ -115,89 +129,106 @@ class ConnectDaemon:
self.log("reset daily intro limits")
def scout_cycle(self):
"""run discovery on all platforms"""
self.log("starting scout cycle...")
"""run discovery - scrape to CENTRAL"""
self.log("starting scout cycle (-> CENTRAL)...")
# dummy db - scrapers save here but we push to central
dummy_db = DummyDb()
scraped_humans = []
try:
scrape_github(self.db, limit_per_source=30)
# github - returns list of humans
from scoutd.github import scrape_github
gh_humans = scrape_github(dummy_db, limit_per_source=30)
if gh_humans:
scraped_humans.extend(gh_humans)
self.log(f" github: {len(gh_humans) if gh_humans else 0} humans")
except Exception as e:
self.log(f"github scout error: {e}")
try:
scrape_reddit(self.db, limit_per_sub=30)
from scoutd.reddit import scrape_reddit
reddit_humans = scrape_reddit(dummy_db, limit_per_sub=30)
if reddit_humans:
scraped_humans.extend(reddit_humans)
self.log(f" reddit: {len(reddit_humans) if reddit_humans else 0} humans")
except Exception as e:
self.log(f"reddit scout error: {e}")
try:
scrape_mastodon(self.db, limit_per_instance=30)
from scoutd.mastodon import scrape_mastodon
masto_humans = scrape_mastodon(dummy_db, limit_per_instance=30)
if masto_humans:
scraped_humans.extend(masto_humans)
self.log(f" mastodon: {len(masto_humans) if masto_humans else 0} humans")
except Exception as e:
self.log(f"mastodon scout error: {e}")
try:
scrape_lobsters(self.db)
forge_humans = scrape_all_forges(limit_per_instance=30)
if forge_humans:
scraped_humans.extend(forge_humans)
self.log(f" forges: {len(forge_humans) if forge_humans else 0} humans")
except Exception as e:
self.log(f"forge scout error: {e}")
try:
from scoutd.lobsters import scrape_lobsters
lob_humans = scrape_lobsters(dummy_db)
if lob_humans:
scraped_humans.extend(lob_humans)
self.log(f" lobsters: {len(lob_humans) if lob_humans else 0} humans")
except Exception as e:
self.log(f"lobsters scout error: {e}")
# push all to central
if scraped_humans:
self.log(f"pushing {len(scraped_humans)} humans to CENTRAL...")
try:
scrape_lemmy(self.db, limit_per_community=30)
created, updated = self.central.upsert_humans_bulk(scraped_humans)
self.log(f" central: {created} created, {updated} updated")
except Exception as e:
self.log(f"lemmy scout error: {e}")
try:
scrape_discord(self.db, limit_per_channel=50)
except Exception as e:
self.log(f"discord scout error: {e}")
self.log(f" central push error: {e}")
self.last_scout = datetime.now()
stats = self.db.stats()
self.log(f"scout complete: {stats['total_humans']} humans in db")
stats = self.central.get_stats()
self.log(f"scout complete: {stats.get('total_humans', 0)} humans in CENTRAL")
def match_priority_users(self):
"""find matches for priority users (hosts)"""
priority_users = get_priority_users(self.db.conn)
"""find matches for priority users (hosts) using CENTRAL data"""
priority_users = get_priority_users(self.local_db.conn)
if not priority_users:
return
self.log(f"matching for {len(priority_users)} priority users...")
self.log(f"matching for {len(priority_users)} priority users (from CENTRAL)...")
humans = self.db.get_all_humans(min_score=20, limit=500)
# get humans from CENTRAL
humans = self.central.get_all_humans(min_score=20)
for puser in priority_users:
# build priority user's fingerprint from their linked profiles
# use stored signals first (from discovery/scoring)
puser_signals = []
puser_text = []
if puser.get('signals'):
stored = puser['signals']
if isinstance(stored, str):
try:
stored = json.loads(stored)
except Exception:
stored = []
puser_signals.extend(stored)
if puser.get('bio'):
puser_text.append(puser['bio'])
if puser.get('interests'):
# supplement with interests if no signals stored
if not puser_signals and puser.get('interests'):
interests = json.loads(puser['interests']) if isinstance(puser['interests'], str) else puser['interests']
puser_signals.extend(interests)
if puser.get('looking_for'):
puser_text.append(puser['looking_for'])
# analyze their linked github if available
if puser.get('github'):
gh_user = analyze_github_user(puser['github'])
if gh_user:
puser_signals.extend(gh_user.get('signals', []))
if not puser_signals:
self.log(f" skipping {puser.get('name')} - no signals")
continue
puser_fingerprint = {
'values_vector': {},
'skills': {},
'interests': list(set(puser_signals)),
'location_pref': 'pnw' if puser.get('location') and 'seattle' in puser['location'].lower() else None,
}
# score text
if puser_text:
_, text_signals, _ = analyze_text(' '.join(puser_text))
puser_signals.extend(text_signals)
# find matches
matches_found = 0
for human in humans:
# skip if it's their own profile on another platform
human_user = human.get('username', '').lower()
if puser.get('github') and human_user == puser['github'].lower():
continue
@ -206,17 +237,18 @@ class ConnectDaemon:
if puser.get('mastodon') and human_user == puser['mastodon'].lower().split('@')[0]:
continue
# calculate overlap
human_signals = human.get('signals', [])
if isinstance(human_signals, str):
try:
human_signals = json.loads(human_signals)
except Exception:
human_signals = []
shared = set(puser_signals) & set(human_signals)
overlap_score = len(shared) * 10
# location bonus
if puser.get('location') and human.get('location'):
if 'seattle' in human['location'].lower() or 'pnw' in human['location'].lower():
if 'seattle' in str(human.get('location', '')).lower() or 'pnw' in str(human.get('location', '')).lower():
overlap_score += 20
if overlap_score >= MIN_OVERLAP_PRIORITY:
@ -224,33 +256,31 @@ class ConnectDaemon:
'overlap_score': overlap_score,
'overlap_reasons': [f"shared: {', '.join(list(shared)[:5])}"] if shared else [],
}
save_priority_match(self.db.conn, puser['id'], human['id'], overlap_data)
save_priority_match(self.local_db.conn, puser['id'], human['id'], overlap_data)
matches_found += 1
if matches_found:
self.log(f" found {matches_found} matches for {puser['name'] or puser['email']}")
def match_strangers(self):
"""find matches between discovered humans (altruistic)"""
self.log("matching strangers...")
"""find matches between discovered humans - save to CENTRAL"""
self.log("matching strangers (-> CENTRAL)...")
humans = self.db.get_all_humans(min_score=40, limit=200)
humans = self.central.get_all_humans(min_score=40)
if len(humans) < 2:
return
# generate fingerprints
fingerprints = {}
for human in humans:
fp = generate_fingerprint(human)
fingerprints[human['id']] = fp
# find pairs
matches_found = 0
new_matches = []
from itertools import combinations
for human_a, human_b in combinations(humans, 2):
# skip same platform same user
if human_a['platform'] == human_b['platform']:
if human_a['username'] == human_b['username']:
continue
@ -260,41 +290,36 @@ class ConnectDaemon:
overlap = find_overlap(human_a, human_b, fp_a, fp_b)
if overlap['overlap_score'] >= MIN_OVERLAP_STRANGERS:
# save match
self.db.save_match(human_a['id'], human_b['id'], overlap)
if overlap and overlap["overlap_score"] >= MIN_OVERLAP_STRANGERS:
new_matches.append({
'human_a_id': human_a['id'],
'human_b_id': human_b['id'],
'overlap_score': overlap['overlap_score'],
'overlap_reasons': json.dumps(overlap.get('overlap_reasons', []))
})
matches_found += 1
if matches_found:
self.log(f"found {matches_found} stranger matches")
# bulk push to central
if new_matches:
self.log(f"pushing {len(new_matches)} matches to CENTRAL...")
try:
created = self.central.create_matches_bulk(new_matches)
self.log(f" central: {created} matches created")
except Exception as e:
self.log(f" central push error: {e}")
self.last_match = datetime.now()
def send_stranger_intros(self):
"""send intros to connect strangers (or preview in dry-run mode)"""
"""send intros using CENTRAL data"""
self.reset_daily_limits()
if not self.dry_run and self.intros_today >= MAX_INTROS_PER_DAY:
self.log("daily intro limit reached")
return
# get unsent matches
c = self.db.conn.cursor()
c.execute('''SELECT m.*,
ha.id as a_id, ha.username as a_user, ha.platform as a_platform,
ha.name as a_name, ha.url as a_url, ha.contact as a_contact,
ha.signals as a_signals, ha.extra as a_extra,
hb.id as b_id, hb.username as b_user, hb.platform as b_platform,
hb.name as b_name, hb.url as b_url, hb.contact as b_contact,
hb.signals as b_signals, hb.extra as b_extra
FROM matches m
JOIN humans ha ON m.human_a_id = ha.id
JOIN humans hb ON m.human_b_id = hb.id
WHERE m.status = 'pending'
ORDER BY m.overlap_score DESC
LIMIT 10''')
matches = c.fetchall()
# get pending matches from CENTRAL
matches = self.central.get_matches(min_score=MIN_OVERLAP_STRANGERS, limit=20)
if self.dry_run:
self.log(f"DRY RUN: previewing {len(matches)} potential intros")
@ -303,59 +328,60 @@ class ConnectDaemon:
if not self.dry_run and self.intros_today >= MAX_INTROS_PER_DAY:
break
match = dict(match)
# get full human data
human_a = self.central.get_human(match['human_a_id'])
human_b = self.central.get_human(match['human_b_id'])
# build human dicts
human_a = {
'id': match['a_id'],
'username': match['a_user'],
'platform': match['a_platform'],
'name': match['a_name'],
'url': match['a_url'],
'contact': match['a_contact'],
'signals': match['a_signals'],
'extra': match['a_extra'],
}
human_b = {
'id': match['b_id'],
'username': match['b_user'],
'platform': match['b_platform'],
'name': match['b_name'],
'url': match['b_url'],
'contact': match['b_contact'],
'signals': match['b_signals'],
'extra': match['b_extra'],
}
if not human_a or not human_b:
continue
match_data = {
'id': match['id'],
'human_a': human_a,
'human_b': human_b,
'overlap_score': match['overlap_score'],
'overlap_reasons': match['overlap_reasons'],
'overlap_reasons': match.get('overlap_reasons', ''),
}
# try to send intro to person with email
for recipient, other in [(human_a, human_b), (human_b, human_a)]:
contact = recipient.get('contact', {})
if isinstance(contact, str):
try:
contact = json.loads(contact)
except Exception:
contact = {}
email = contact.get('email')
if not email:
continue
# draft intro
intro = draft_intro(match_data, recipient='a' if recipient == human_a else 'b')
# check if already contacted
if self.central.already_contacted(recipient['id']):
continue
# parse overlap reasons for display
reasons = match['overlap_reasons']
# get token and interest count for recipient
try:
recipient_token = self.central.get_token(recipient['id'], match.get('id'))
interested_count = self.central.get_interested_count(recipient['id'])
except Exception as e:
print(f"[intro] failed to get token/count: {e}")
recipient_token = None
interested_count = 0
intro = draft_intro(match_data,
recipient='a' if recipient == human_a else 'b',
recipient_token=recipient_token,
interested_count=interested_count)
reasons = match.get('overlap_reasons', '')
if isinstance(reasons, str):
try:
reasons = json.loads(reasons)
except Exception:
reasons = []
reason_summary = ', '.join(reasons[:3]) if reasons else 'aligned values'
if self.dry_run:
# print preview
print("\n" + "=" * 60)
print(f"TO: {recipient['username']} ({recipient['platform']})")
print(f"EMAIL: {email}")
@ -369,7 +395,11 @@ class ConnectDaemon:
print("=" * 60)
break
else:
# actually send
outreach_id = self.central.claim_outreach(recipient['id'], match['id'], 'intro')
if outreach_id is None:
self.log(f"skipping {recipient['username']} - already claimed")
continue
success, error = send_email(
email,
f"connectd: you might want to meet {other['username']}",
@ -379,22 +409,124 @@ class ConnectDaemon:
if success:
self.log(f"sent intro to {recipient['username']} ({email})")
self.intros_today += 1
# mark match as intro_sent
c.execute('UPDATE matches SET status = "intro_sent" WHERE id = ?',
(match['id'],))
self.db.conn.commit()
self.central.complete_outreach(outreach_id, 'sent', 'email', intro['draft'])
break
else:
self.log(f"failed to send to {email}: {error}")
self.central.complete_outreach(outreach_id, 'failed', error=error)
self.last_intro = datetime.now()
def send_priority_user_intros(self):
"""send intros TO priority users (hosts) about their matches"""
self.reset_daily_limits()
priority_users = get_priority_users(self.local_db.conn)
if not priority_users:
return
self.log(f"checking intros for {len(priority_users)} priority users...")
for puser in priority_users:
if not self.dry_run and self.intros_today >= MAX_INTROS_PER_DAY:
break
# get email
email = puser.get('email')
if not email:
continue
# get their matches from local priority_matches table
matches = get_priority_user_matches(self.local_db.conn, puser['id'], status='new', limit=5)
if not matches:
continue
for match in matches:
if not self.dry_run and self.intros_today >= MAX_INTROS_PER_DAY:
break
# get the matched human from CENTRAL (matched_human_id is central id)
human_id = match.get('matched_human_id')
if not human_id:
continue
human = self.central.get_human(human_id)
if not human:
continue
# build match data for drafting
overlap_reasons = match.get('overlap_reasons', '[]')
if isinstance(overlap_reasons, str):
try:
overlap_reasons = json.loads(overlap_reasons)
except json.JSONDecodeError:
overlap_reasons = []
puser_name = puser.get('name') or puser.get('email', '').split('@')[0]
human_name = human.get('name') or human.get('username')
# draft intro TO priority user ABOUT the matched human
match_data = {
'id': match.get('id'),
'human_a': {
'username': puser_name,
'platform': 'host',
'name': puser_name,
'bio': puser.get('bio', ''),
'signals': puser.get('signals', []),
},
'human_b': human,
'overlap_score': match.get('overlap_score', 0),
'overlap_reasons': overlap_reasons,
}
# try to get token for priority user (they might have a central ID)
recipient_token = None
interested_count = 0
if puser.get('central_id'):
try:
recipient_token = self.central.get_token(puser['central_id'], match.get('id'))
interested_count = self.central.get_interested_count(puser['central_id'])
except Exception:
pass
intro = draft_intro(match_data, recipient='a',
recipient_token=recipient_token,
interested_count=interested_count)
reason_summary = ', '.join(overlap_reasons[:3]) if overlap_reasons else 'aligned values'
if self.dry_run:
print("\n" + "=" * 60)
print("PRIORITY USER INTRO")
print("=" * 60)
print(f"TO: {puser_name} ({email})")
print(f"ABOUT: {human_name} ({human.get('platform')})")
print(f"SCORE: {match.get('overlap_score', 0):.0f} ({reason_summary})")
print("-" * 60)
print("MESSAGE:")
print(intro['draft'])
print("-" * 60)
print("[DRY RUN - NOT SENT]")
print("=" * 60)
else:
success, error = send_email(
email,
f"connectd: you might want to meet {human_name}",
intro['draft']
)
if success:
self.log(f"sent priority intro to {puser_name} about {human_name}")
self.intros_today += 1
# mark match as notified
mark_match_viewed(self.local_db.conn, match['id'])
else:
self.log(f"failed to send priority intro to {email}: {error}")
def send_lost_builder_intros(self):
"""
reach out to lost builders - different tone, lower volume.
these people need encouragement, not networking.
"""
"""reach out to lost builders using CENTRAL data"""
self.reset_daily_limits()
lost_config = get_lost_config()
@ -407,43 +539,60 @@ class ConnectDaemon:
self.log("daily lost builder intro limit reached")
return
# find lost builders with matching active builders
matches, error = find_matches_for_lost_builders(
self.db,
min_lost_score=lost_config.get('min_lost_score', 40),
min_values_score=lost_config.get('min_values_score', 20),
# get lost builders from CENTRAL
lost_builders = self.central.get_lost_builders(
min_score=lost_config.get('min_lost_score', 40),
limit=max_per_day - self.lost_intros_today
)
if error:
self.log(f"lost builder matching error: {error}")
return
# get active builders from CENTRAL
builders = self.central.get_builders(min_score=50, limit=100)
if not matches:
self.log("no lost builders ready for outreach")
if not lost_builders or not builders:
self.log("no lost builders or builders available")
return
if self.dry_run:
self.log(f"DRY RUN: previewing {len(matches)} lost builder intros")
self.log(f"DRY RUN: previewing {len(lost_builders)} lost builder intros")
for match in matches:
for lost in lost_builders:
if not self.dry_run and self.lost_intros_today >= max_per_day:
break
lost = match['lost_user']
builder = match['inspiring_builder']
# find matching builder
best_builder = None
best_score = 0
for builder in builders:
lost_signals = lost.get('signals', [])
builder_signals = builder.get('signals', [])
if isinstance(lost_signals, str):
try:
lost_signals = json.loads(lost_signals)
except json.JSONDecodeError:
lost_signals = []
if isinstance(builder_signals, str):
try:
builder_signals = json.loads(builder_signals)
except json.JSONDecodeError:
builder_signals = []
shared = set(lost_signals) & set(builder_signals)
if len(shared) > best_score:
best_score = len(shared)
best_builder = builder
if not best_builder:
continue
lost_name = lost.get('name') or lost.get('username')
builder_name = builder.get('name') or builder.get('username')
builder_name = best_builder.get('name') or best_builder.get('username')
# draft intro
draft, draft_error = draft_lost_intro(lost, builder, lost_config)
draft, draft_error = draft_lost_intro(lost, best_builder, lost_config)
if draft_error:
self.log(f"error drafting lost intro for {lost_name}: {draft_error}")
continue
# determine best contact method (activity-based)
method, contact_info = determine_best_contact(lost)
if self.dry_run:
@ -453,9 +602,7 @@ class ConnectDaemon:
print(f"TO: {lost_name} ({lost.get('platform')})")
print(f"DELIVERY: {method}{contact_info}")
print(f"LOST SCORE: {lost.get('lost_potential_score', 0)}")
print(f"VALUES SCORE: {lost.get('score', 0)}")
print(f"INSPIRING BUILDER: {builder_name}")
print(f"SHARED INTERESTS: {', '.join(match.get('shared_interests', []))}")
print("-" * 60)
print("MESSAGE:")
print(draft)
@ -463,12 +610,11 @@ class ConnectDaemon:
print("[DRY RUN - NOT SENT]")
print("=" * 60)
else:
# build match data for unified delivery
match_data = {
'human_a': builder, # inspiring builder
'human_b': lost, # lost builder (recipient)
'overlap_score': match.get('match_score', 0),
'overlap_reasons': match.get('shared_interests', []),
'human_a': best_builder,
'human_b': lost,
'overlap_score': best_score * 10,
'overlap_reasons': [],
}
success, error, delivery_method = deliver_intro(match_data, draft)
@ -476,7 +622,6 @@ class ConnectDaemon:
if success:
self.log(f"sent lost builder intro to {lost_name} via {delivery_method}")
self.lost_intros_today += 1
self.db.mark_lost_outreach(lost['id'])
else:
self.log(f"failed to reach {lost_name} via {delivery_method}: {error}")
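The shared-signal selection loop above can be sketched as a standalone helper. A minimal sketch; the function and variable names here are illustrative, not from the codebase:

```python
import json

def best_overlap(lost_signals, builders):
    """pick the builder sharing the most signals with a lost builder."""
    def as_list(sig):
        # signals may arrive as a JSON string from CENTRAL or as a list
        if isinstance(sig, str):
            try:
                return json.loads(sig)
            except json.JSONDecodeError:
                return []
        return sig or []
    lost = set(as_list(lost_signals))
    best, score = None, 0
    for b in builders:
        shared = lost & set(as_list(b.get('signals')))
        if len(shared) > score:
            best, score = b, len(shared)
    return best, score

builders = [{'name': 'ada', 'signals': ['selfhosting', 'mesh']},
            {'name': 'lin', 'signals': '["selfhosting", "solar", "mesh"]'}]
match, shared_count = best_overlap(['mesh', 'solar', 'selfhosting'], builders)
```

Ties keep the first builder seen, mirroring the strict `>` comparison in the loop above.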
@ -485,9 +630,8 @@ class ConnectDaemon:
def run(self):
"""main daemon loop"""
self.log("connectd daemon starting...")
self.log("connectd daemon starting (CENTRAL MODE)...")
# start API server
start_api_thread()
self.log("api server started on port 8099")
@ -506,36 +650,31 @@ class ConnectDaemon:
while self.running:
now = datetime.now()
# scout cycle
if not self.last_scout or (now - self.last_scout).seconds >= SCOUT_INTERVAL:
self.scout_cycle()
self._update_api_state()
# match cycle
if not self.last_match or (now - self.last_match).seconds >= MATCH_INTERVAL:
self.match_priority_users()
self.match_strangers()
self._update_api_state()
# intro cycle
if not self.last_intro or (now - self.last_intro).seconds >= INTRO_INTERVAL:
self.send_stranger_intros()
self.send_priority_user_intros()
self._update_api_state()
# lost builder cycle
if not self.last_lost or (now - self.last_lost).seconds >= LOST_INTERVAL:
self.send_lost_builder_intros()
self._update_api_state()
# sleep between checks
time.sleep(60)
self.log("connectd daemon stopped")
self.db.close()
self.local_db.close()
def run_daemon(dry_run=False):
"""entry point"""
daemon = ConnectDaemon(dry_run=dry_run)
daemon.run()


@ -139,20 +139,18 @@ def save_priority_match(conn, priority_user_id, human_id, overlap_data):
def get_priority_user_matches(conn, priority_user_id, status=None, limit=50):
"""get matches for a priority user"""
"""get matches for a priority user (humans fetched from CENTRAL separately)"""
c = conn.cursor()
if status:
c.execute('''SELECT pm.*, h.* FROM priority_matches pm
JOIN humans h ON pm.matched_human_id = h.id
WHERE pm.priority_user_id = ? AND pm.status = ?
ORDER BY pm.overlap_score DESC
c.execute('''SELECT * FROM priority_matches
WHERE priority_user_id = ? AND status = ?
ORDER BY overlap_score DESC
LIMIT ?''', (priority_user_id, status, limit))
else:
c.execute('''SELECT pm.*, h.* FROM priority_matches pm
JOIN humans h ON pm.matched_human_id = h.id
WHERE pm.priority_user_id = ?
ORDER BY pm.overlap_score DESC
c.execute('''SELECT * FROM priority_matches
WHERE priority_user_id = ?
ORDER BY overlap_score DESC
LIMIT ?''', (priority_user_id, limit))
return [dict(row) for row in c.fetchall()]
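Since the rewritten query no longer joins a local `humans` table, callers pair each row with CENTRAL data themselves. A sketch under that assumption; `hydrate_matches` is a hypothetical helper and `get_human` stands in for the central client's lookup:

```python
def hydrate_matches(matches, get_human):
    """attach the CENTRAL human record to each local priority_match row."""
    out = []
    for m in matches:
        human = get_human(m.get('matched_human_id'))
        if human:  # skip rows whose human no longer exists centrally
            out.append({**m, 'human': human})
    return out

central = {7: {'username': 'ada'}}
rows = [{'id': 1, 'matched_human_id': 7}, {'id': 2, 'matched_human_id': 9}]
hydrated = hydrate_matches(rows, central.get)
```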


@ -183,7 +183,7 @@ class Database:
row = c.fetchone()
return dict(row) if row else None
def get_all_humans(self, min_score=0, limit=1000):
def get_all_humans(self, min_score=0, limit=100000):
"""get all humans above score threshold"""
c = self.conn.cursor()
c.execute('''SELECT * FROM humans
@ -347,10 +347,10 @@ class Database:
c.execute('SELECT COUNT(*) FROM matches')
stats['total_matches'] = c.fetchone()[0]
c.execute('SELECT COUNT(*) FROM intros')
c.execute('SELECT COUNT(*) FROM matches WHERE status = "intro_sent"')
stats['total_intros'] = c.fetchone()[0]
c.execute('SELECT COUNT(*) FROM intros WHERE status = "sent"')
c.execute('SELECT COUNT(*) FROM matches WHERE status = "intro_sent"')
stats['sent_intros'] = c.fetchone()[0]
# lost builder stats
@ -373,3 +373,64 @@ class Database:
def close(self):
self.conn.close()
def purge_disqualified(self):
"""
auto-cleanup: remove all matches/intros involving users with disqualifying signals
DISQUALIFYING: maga, conspiracy, conservative, antivax, sovcit
"""
c = self.conn.cursor()
purged = {}
# patterns to match disqualifying signals
disq_patterns = ["maga", "conspiracy", "conservative", "antivax", "sovcit"]
# build WHERE clause for negative_signals check
# (safe only because disq_patterns is a hardcoded constant above;
# switch to parameterized LIKE placeholders if it ever becomes user-supplied)
neg_check = " OR ".join([f"negative_signals LIKE '%{p}%'" for p in disq_patterns])
# 1. delete from intros where recipient is disqualified
c.execute(f"""
DELETE FROM intros WHERE recipient_human_id IN (
SELECT id FROM humans WHERE {neg_check}
)
""")
purged["intros"] = c.rowcount
# 2. delete from priority_matches where matched_human is disqualified
c.execute(f"""
DELETE FROM priority_matches WHERE matched_human_id IN (
SELECT id FROM humans WHERE {neg_check}
)
""")
purged["priority_matches"] = c.rowcount
# 3. delete from matches where either human is disqualified
c.execute(f"""
DELETE FROM matches WHERE
human_a_id IN (SELECT id FROM humans WHERE {neg_check})
OR human_b_id IN (SELECT id FROM humans WHERE {neg_check})
""")
purged["matches"] = c.rowcount
# 4. cleanup orphaned records (humans deleted but refs remain)
c.execute("""
DELETE FROM matches WHERE
NOT EXISTS (SELECT 1 FROM humans h WHERE h.id = human_a_id)
OR NOT EXISTS (SELECT 1 FROM humans h WHERE h.id = human_b_id)
""")
purged["orphaned_matches"] = c.rowcount
c.execute("""
DELETE FROM priority_matches WHERE
NOT EXISTS (SELECT 1 FROM humans h WHERE h.id = matched_human_id)
""")
purged["orphaned_priority"] = c.rowcount
c.execute("""
DELETE FROM intros WHERE
NOT EXISTS (SELECT 1 FROM humans h WHERE h.id = recipient_human_id)
""")
purged["orphaned_intros"] = c.rowcount
self.conn.commit()
return purged
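The interpolated `LIKE` clause above is fine for a hardcoded pattern list, but the same check can be written with bound parameters. A minimal sketch against an in-memory sqlite3 database with an assumed one-column schema:

```python
import sqlite3

patterns = ["maga", "conspiracy", "conservative", "antivax", "sovcit"]

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute('CREATE TABLE humans (id INTEGER PRIMARY KEY, negative_signals TEXT)')
c.executemany('INSERT INTO humans (negative_signals) VALUES (?)',
              [('["maga"]',), ('["solarpunk"]',)])

# one placeholder per pattern; each %pattern% is bound, not interpolated
clause = ' OR '.join(['negative_signals LIKE ?'] * len(patterns))
params = [f'%{p}%' for p in patterns]
c.execute(f'SELECT COUNT(*) FROM humans WHERE {clause}', params)
disqualified = c.fetchone()[0]
```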


@ -39,6 +39,36 @@ from .github import get_github_user, get_user_repos, _api_get as github_api
from .mastodon import analyze_mastodon_user, _api_get as mastodon_api
from .handles import discover_all_handles, extract_handles_from_text, scrape_website_for_handles
# MASTODON HANDLE FILTER - don't treat these as emails
MASTODON_INSTANCES = [
'mastodon.social', 'fosstodon.org', 'hachyderm.io', 'tech.lgbt',
'social.coop', 'masto.ai', 'infosec.exchange', 'hackers.town',
'chaos.social', 'mathstodon.xyz', 'scholar.social', 'mas.to',
'mstdn.social', 'mastodon.online', 'universeodon.com', 'mastodon.world',
]
def is_mastodon_handle(email):
"""check if string looks like mastodon handle not email"""
if not email or '@' not in email:
return False
email_lower = email.lower()
# check for @username@instance pattern
parts = email_lower.split('@')
if len(parts) == 3 and parts[0] == '': # @user@instance
return True
if len(parts) == 2:
# check if domain is known mastodon instance
domain = parts[1]
for instance in MASTODON_INSTANCES:
if domain == instance or domain.endswith('.' + instance):
return True
# also check common patterns
if 'mastodon' in domain or 'masto' in domain:
return True
return False
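A few inputs illustrate what the filter above accepts and rejects. This is a trimmed standalone copy for illustration, with a shortened instance list:

```python
# minimal sketch of the mastodon-handle filter, with a shortened instance list
KNOWN_INSTANCES = {'mastodon.social', 'fosstodon.org', 'hachyderm.io'}

def looks_like_mastodon_handle(s):
    if not s or '@' not in s:
        return False
    parts = s.lower().split('@')
    if len(parts) == 3 and parts[0] == '':  # @user@instance form
        return True
    if len(parts) == 2:
        domain = parts[1]
        if domain in KNOWN_INSTANCES or 'mastodon' in domain or 'masto' in domain:
            return True
    return False

a = looks_like_mastodon_handle('@alice@fosstodon.org')  # leading-@ fediverse form
b = looks_like_mastodon_handle('alice@example.com')     # plain email
c = looks_like_mastodon_handle('bob@mastodon.social')   # known instance domain
```

Note the heuristic nature: a real email at a domain containing "mastodon" would be filtered too, which is the trade-off the scraper accepts.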
# local cache for org memberships
ORG_CACHE_FILE = Path(__file__).parent.parent / 'data' / 'org_cache.json'
_org_cache = None
@ -674,7 +704,8 @@ def deep_scrape_github_user(login, scrape_commits=True):
profile['emails'].extend(website_emails)
# dedupe emails and pick best one
profile['emails'] = list(set(profile['emails']))
# FILTER OUT MASTODON HANDLES (they're not emails!)
profile['emails'] = [e for e in set(profile['emails']) if e and not is_mastodon_handle(e)]
# rank emails by preference
def email_score(email):


@ -147,6 +147,87 @@ def create_github_issue(owner, repo, title, body, dry_run=False):
return False, str(e)
def create_forge_issue(platform_type, instance_url, owner, repo, title, body, dry_run=False):
"""
create issue on self-hosted git forge.
supports gitea/forgejo/gogs (same API) and gitlab.
"""
from config import CODEBERG_TOKEN, GITEA_TOKENS, GITLAB_TOKENS
if dry_run:
print(f" [dry run] would create issue on {platform_type}:{instance_url}/{owner}/{repo}")
return True, None
try:
if platform_type in ('gitea', 'forgejo', 'gogs'):
# get token for this instance
token = None
if 'codeberg.org' in instance_url:
token = CODEBERG_TOKEN
else:
token = GITEA_TOKENS.get(instance_url)
if not token:
return False, f"no auth token for {instance_url}"
# gitea API
api_url = f"{instance_url}/api/v1/repos/{owner}/{repo}/issues"
headers = {
'Content-Type': 'application/json',
'Authorization': f'token {token}'
}
data = {'title': title, 'body': body}
resp = requests.post(api_url, headers=headers, json=data, timeout=15)
if resp.status_code in (200, 201):
return True, resp.json().get('html_url')
else:
return False, f"gitea api error: {resp.status_code} - {resp.text[:200]}"
elif platform_type == 'gitlab':
token = GITLAB_TOKENS.get(instance_url)
if not token:
return False, f"no auth token for {instance_url}"
# need to get project ID first
search_url = f"{instance_url}/api/v4/projects"
headers = {'PRIVATE-TOKEN': token}
params = {'search': repo}
resp = requests.get(search_url, headers=headers, params=params, timeout=15)
if resp.status_code != 200:
return False, f"gitlab project lookup failed: {resp.status_code}"
projects = resp.json()
project_id = None
for p in projects:
if p.get('path') == repo or p.get('name') == repo:
project_id = p.get('id')
break
if not project_id:
return False, f"project {repo} not found"
# create issue
issue_url = f"{instance_url}/api/v4/projects/{project_id}/issues"
data = {'title': title, 'description': body}
resp = requests.post(issue_url, headers=headers, json=data, timeout=15)
if resp.status_code in (200, 201):
return True, resp.json().get('web_url')
else:
return False, f"gitlab api error: {resp.status_code}"
elif platform_type == 'sourcehut':
return False, "sourcehut uses mailing lists - use email instead"
else:
return False, f"unknown forge type: {platform_type}"
except Exception as e:
return False, str(e)
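Per-instance gitea/forgejo tokens are looked up by host; the `.env.example` names them `GITEA_TOKEN_<host_with_underscores>`. A sketch of deriving that env var name from an instance URL; the helper name is hypothetical:

```python
from urllib.parse import urlparse

def gitea_token_env_name(instance_url):
    """map an instance URL to its GITEA_TOKEN_<host> env var name."""
    host = urlparse(instance_url).netloc or instance_url
    # dots and port colons become underscores: git.example.com -> git_example_com
    return 'GITEA_TOKEN_' + host.replace('.', '_').replace(':', '_')

name_a = gitea_token_env_name('https://git.example.com')
name_b = gitea_token_env_name('http://192.168.1.8:3000')
```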
def send_mastodon_dm(recipient_acct, message, dry_run=False):
"""send mastodon direct message"""
if not MASTODON_TOKEN:
@ -348,12 +429,13 @@ def determine_best_contact(human):
return method, info
def deliver_intro(match_data, intro_draft, dry_run=False):
def deliver_intro(match_data, intro_draft, subject=None, dry_run=False):
"""
deliver an intro via the best available method
match_data: {human_a, human_b, overlap_score, overlap_reasons}
intro_draft: the text to send (from groq)
subject: optional subject line for email/github (from groq)
"""
recipient = match_data.get('human_b', {})
recipient_id = f"{recipient.get('platform')}:{recipient.get('username')}"
@ -365,6 +447,10 @@ def deliver_intro(match_data, intro_draft, dry_run=False):
# determine contact method
method, contact_info = determine_best_contact(recipient)
# if no contact method found, skip (will retry after deeper scraping)
if method is None:
return False, "no contact method found - needs deeper scraping", None
log = load_delivery_log()
result = {
'recipient_id': recipient_id,
@ -373,14 +459,15 @@ def deliver_intro(match_data, intro_draft, dry_run=False):
'contact_info': contact_info,
'overlap_score': match_data.get('overlap_score'),
'timestamp': datetime.now().isoformat(),
'draft': intro_draft, # store the actual message sent
}
success = False
error = None
if method == 'email':
subject = f"someone you might want to know - connectd"
success, error = send_email(contact_info, subject, intro_draft, dry_run)
email_subject = subject or "connecting builders - someone you might want to know"
success, error = send_email(contact_info, email_subject, intro_draft, dry_run)
elif method == 'mastodon':
success, error = send_mastodon_dm(contact_info, intro_draft, dry_run)
@ -402,7 +489,7 @@ def deliver_intro(match_data, intro_draft, dry_run=False):
elif method == 'github_issue':
owner = contact_info.get('owner')
repo = contact_info.get('repo')
title = "community introduction from connectd"
title = subject or "community introduction from connectd"
# format for github
github_body = f"""hey {recipient.get('name') or recipient.get('username')},
@ -413,19 +500,94 @@ def deliver_intro(match_data, intro_draft, dry_run=False):
"""
success, error = create_github_issue(owner, repo, title, github_body, dry_run)
elif method == 'forge_issue':
# self-hosted git forge issue (gitea/forgejo/gitlab/sourcehut)
platform_type = contact_info.get('platform_type')
instance_url = contact_info.get('instance_url')
owner = contact_info.get('owner')
repo = contact_info.get('repo')
title = subject or "community introduction from connectd"
# get the other person's contact info for bidirectional link
sender = match_data.get('human_a', {})
sender_name = sender.get('name') or sender.get('username') or 'someone'
sender_platform = sender.get('platform', '')
sender_url = sender.get('url', '')
if not sender_url:
if sender_platform == 'github':
sender_url = f"https://github.com/{sender.get('username')}"
elif sender_platform == 'mastodon':
sender_url = f"https://fosstodon.org/@{sender.get('username')}"
elif ':' in sender_platform: # forge platform
extra = sender.get('extra', {})
if isinstance(extra, str):
import json as _json
extra = _json.loads(extra) if extra else {}
sender_url = extra.get('instance_url', '') + '/' + sender.get('username', '')
forge_body = f"""hey {recipient.get('name') or recipient.get('username')},
{intro_draft}
**reach them at:** {sender_url or 'see their profile'}
---
*this is an automated introduction from [connectd](https://github.com/connectd-daemon) - a daemon that finds isolated builders with aligned values and connects them.*
*if this feels spammy, close this issue and we won't reach out again.*
"""
success, error = create_forge_issue(platform_type, instance_url, owner, repo, title, forge_body, dry_run)
elif method == 'manual':
# add to review queue
add_to_manual_queue({
'match': match_data,
'draft': intro_draft,
'recipient': recipient,
})
# skip - no longer using manual queue
success = False
error = "manual method deprecated - skipping"
# FALLBACK CHAIN: if primary method failed, try fallbacks
if not success and fallbacks:
# email_subject is only bound in the email branch above; rebind here so
# non-email primaries can still fall back without a NameError
email_subject = subject or "connecting builders - someone you might want to know"
for fallback_method, fallback_info in fallbacks:
result['fallback_attempts'] = result.get('fallback_attempts', [])
result['fallback_attempts'].append({'method': fallback_method})
fb_success = False
fb_error = None
if fallback_method == 'email':
fb_success, fb_error = send_email(fallback_info, email_subject, intro_draft, dry_run)
elif fallback_method == 'mastodon':
fb_success, fb_error = send_mastodon_dm(fallback_info, intro_draft, dry_run)
elif fallback_method == 'bluesky':
fb_success, fb_error = send_bluesky_dm(fallback_info, intro_draft, dry_run)
elif fallback_method == 'matrix':
fb_success, fb_error = send_matrix_dm(fallback_info, intro_draft, dry_run)
elif fallback_method == 'github_issue':
owner = fallback_info.get('owner') if isinstance(fallback_info, dict) else fallback_info.split('/')[0]
repo = fallback_info.get('repo') if isinstance(fallback_info, dict) else fallback_info.split('/')[1]
fb_success, fb_error = create_github_issue(owner, repo, email_subject, intro_draft, dry_run)
elif fallback_method == 'forge_issue':
fb_success, fb_error = create_forge_issue(
fallback_info.get('platform_type'),
fallback_info.get('instance_url'),
fallback_info.get('owner'),
fallback_info.get('repo'),
email_subject, intro_draft, dry_run
)
if fb_success:
success = True
error = "added to manual queue"
method = fallback_method
contact_info = fallback_info
error = None
result['fallback_succeeded'] = fallback_method
break
else:
result['fallback_attempts'][-1]['error'] = fb_error
# log result
result['success'] = success
result['error'] = error
result['final_method'] = method
if success:
log['sent'].append(result)
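The fallback chain above reduces to a simple pattern: try each sender in order, record every attempt, stop at the first success. A minimal sketch with stand-in sender callables, not the real delivery functions:

```python
def try_with_fallbacks(primary, fallbacks):
    """attempt (name, send_fn) pairs in order; return first success + log."""
    attempts = []
    for name, send in [primary] + fallbacks:
        ok, err = send()  # each sender returns (success, error)
        attempts.append({'method': name, 'error': err})
        if ok:
            return name, attempts
    return None, attempts

sent = lambda: (True, None)
down = lambda: (False, 'timeout')
method, attempt_log = try_with_fallbacks(
    ('email', down), [('mastodon', down), ('matrix', sent)])
```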


@ -7,10 +7,21 @@ services:
- .env
ports:
- "8099:8099"
extra_hosts:
- "mastodon.sudoxreboot.com:192.168.1.39"
volumes:
- ./data:/app/data
- ./db:/app/db
# daemon runs continuously by default
# for one-shot or dry-run, override command:
# command: ["python", "daemon.py", "--dry-run"]
# command: ["python", "cli.py", "scout"]
- ./data_db:/data/db
- ./daemon.py:/app/daemon.py:ro
- ./deep.py:/app/scoutd/deep.py:ro
- ./db_init.py:/app/db/__init__.py:ro
- ./config.py:/app/config.py:ro
- ./groq_draft.py:/app/introd/groq_draft.py:ro
- ./api.py:/app/api.py:ro
- ./deliver.py:/app/introd/deliver.py:ro
- ./soul.txt:/app/soul.txt:ro
- ./scoutd/reddit.py:/app/scoutd/reddit.py:ro
- ./matchd/overlap.py:/app/matchd/overlap.py:ro
- ./central_client.py:/app/central_client.py:ro
- ./scoutd/forges.py:/app/scoutd/forges.py:ro

favicon.png (new binary file, 1.4 MiB)

groq_draft.py (new file, 419 lines)

@ -0,0 +1,419 @@
"""
connectd - groq message drafting
reads soul from file, uses as guideline for llm to personalize
"""
import os
import json
from groq import Groq
GROQ_API_KEY = os.getenv("GROQ_API_KEY")
GROQ_MODEL = os.getenv("GROQ_MODEL", "llama-3.3-70b-versatile")
client = Groq(api_key=GROQ_API_KEY) if GROQ_API_KEY else None
# load soul from file (guideline, not script)
SOUL_PATH = os.getenv("SOUL_PATH", "/app/soul.txt")
def load_soul():
try:
with open(SOUL_PATH, 'r') as f:
return f.read().strip()
except OSError:
return None
SIGNATURE_HTML = """
<div style="margin-top: 24px; padding-top: 16px; border-top: 1px solid #333;">
<div style="margin-bottom: 12px;">
<a href="https://github.com/sudoxnym/connectd" style="color: #8b5cf6; text-decoration: none; font-size: 14px;">github.com/sudoxnym/connectd</a>
<span style="color: #666; font-size: 12px; margin-left: 8px;">(main repo)</span>
</div>
<div style="display: flex; gap: 16px; align-items: center;">
<a href="https://github.com/connectd-daemon" title="GitHub" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M12 .297c-6.63 0-12 5.373-12 12 0 5.303 3.438 9.8 8.205 11.385.6.113.82-.258.82-.577 0-.285-.01-1.04-.015-2.04-3.338.724-4.042-1.61-4.042-1.61C4.422 18.07 3.633 17.7 3.633 17.7c-1.087-.744.084-.729.084-.729 1.205.084 1.838 1.236 1.838 1.236 1.07 1.835 2.809 1.305 3.495.998.108-.776.417-1.305.76-1.605-2.665-.3-5.466-1.332-5.466-5.93 0-1.31.465-2.38 1.235-3.22-.135-.303-.54-1.523.105-3.176 0 0 1.005-.322 3.3 1.23.96-.267 1.98-.399 3-.405 1.02.006 2.04.138 3 .405 2.28-1.552 3.285-1.23 3.285-1.23.645 1.653.24 2.873.12 3.176.765.84 1.23 1.91 1.23 3.22 0 4.61-2.805 5.625-5.475 5.92.42.36.81 1.096.81 2.22 0 1.606-.015 2.896-.015 3.286 0 .315.21.69.825.57C20.565 22.092 24 17.592 24 12.297c0-6.627-5.373-12-12-12"/></svg>
</a>
<a href="https://mastodon.sudoxreboot.com/@connectd" title="Mastodon" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M23.268 5.313c-.35-2.578-2.617-4.61-5.304-5.004C17.51.242 15.792 0 11.813 0h-.03c-3.98 0-4.835.242-5.288.309C3.882.692 1.496 2.518.917 5.127.64 6.412.61 7.837.661 9.143c.074 1.874.088 3.745.26 5.611.118 1.24.325 2.47.62 3.68.55 2.237 2.777 4.098 4.96 4.857 2.336.792 4.849.923 7.256.38.265-.061.527-.132.786-.213.585-.184 1.27-.39 1.774-.753a.057.057 0 0 0 .023-.043v-1.809a.052.052 0 0 0-.02-.041.053.053 0 0 0-.046-.01 20.282 20.282 0 0 1-4.709.545c-2.73 0-3.463-1.284-3.674-1.818a5.593 5.593 0 0 1-.319-1.433.053.053 0 0 1 .066-.054c1.517.363 3.072.546 4.632.546.376 0 .75 0 1.125-.01 1.57-.044 3.224-.124 4.768-.422.038-.008.077-.015.11-.024 2.435-.464 4.753-1.92 4.989-5.604.008-.145.03-1.52.03-1.67.002-.512.167-3.63-.024-5.545zm-3.748 9.195h-2.561V8.29c0-1.309-.55-1.976-1.67-1.976-1.23 0-1.846.79-1.846 2.35v3.403h-2.546V8.663c0-1.56-.617-2.35-1.848-2.35-1.112 0-1.668.668-1.67 1.977v6.218H4.822V8.102c0-1.31.337-2.35 1.011-3.12.696-.77 1.608-1.164 2.74-1.164 1.311 0 2.302.5 2.962 1.498l.638 1.06.638-1.06c.66-.999 1.65-1.498 2.96-1.498 1.13 0 2.043.395 2.74 1.164.675.77 1.012 1.81 1.012 3.12z"/></svg>
</a>
<a href="https://bsky.app/profile/connectd.bsky.social" title="Bluesky" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M5.202 2.857C7.954 4.922 10.913 9.11 12 11.358c1.087-2.247 4.046-6.436 6.798-8.501C20.783 1.366 24 .213 24 3.883c0 .732-.42 6.156-.667 7.037-.856 3.061-3.978 3.842-6.755 3.37 4.854.826 6.089 3.562 3.422 6.299-5.065 5.196-7.28-1.304-7.847-2.97-.104-.305-.152-.448-.153-.327 0-.121-.05.022-.153.327-.568 1.666-2.782 8.166-7.847 2.97-2.667-2.737-1.432-5.473 3.422-6.3-2.777.473-5.899-.308-6.755-3.369C.42 10.04 0 4.615 0 3.883c0-3.67 3.217-2.517 5.202-1.026"/></svg>
</a>
<a href="https://lemmy.sudoxreboot.com/c/connectd" title="Lemmy" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M2.9595 4.2228a3.9132 3.9132 0 0 0-.332.019c-.8781.1012-1.67.5699-2.155 1.3862-.475.8-.5922 1.6809-.35 2.4971.2421.8162.8297 1.5575 1.6982 2.1449.0053.0035.0106.0076.0163.0114.746.4498 1.492.7431 2.2877.8994-.02.3318-.0272.6689-.006 1.0181.0634 1.0432.4368 2.0006.996 2.8492l-2.0061.8189a.4163.4163 0 0 0-.2276.2239.416.416 0 0 0 .0879.455.415.415 0 0 0 .2941.1231.4156.4156 0 0 0 .1595-.0312l2.2093-.9035c.408.4859.8695.9315 1.3723 1.318.0196.0151.0407.0264.0603.0423l-1.2918 1.7103a.416.416 0 0 0 .664.501l1.314-1.7385c.7185.4548 1.4782.7927 2.2294 1.0242.3833.7209 1.1379 1.1871 2.0202 1.1871.8907 0 1.6442-.501 2.0242-1.2072.744-.2347 1.4959-.5729 2.2073-1.0262l1.332 1.7606a.4157.4157 0 0 0 .7439-.1936.4165.4165 0 0 0-.0799-.3074l-1.3099-1.7345c.0083-.0075.0178-.0113.0261-.0188.4968-.3803.9549-.8175 1.3622-1.2939l2.155.8794a.4156.4156 0 0 0 .5412-.2276.4151.4151 0 0 0-.2273-.5432l-1.9438-.7928c.577-.8538.9697-1.8183 1.0504-2.8693.0268-.3507.0242-.6914.0079-1.0262.7905-.1572 1.5321-.4502 2.2737-.8974.0053-.0033.011-.0076.0163-.0113.8684-.5874 1.456-1.3287 1.6982-2.145.2421-.8161.125-1.697-.3501-2.497-.4849-.8163-1.2768-1.2852-2.155-1.3863a3.2175 3.2175 0 0 0-.332-.0189c-.7852-.0151-1.6231.229-2.4286.6942-.5926.342-1.1252.867-1.5433 1.4387-1.1699-.6703-2.6923-1.0476-4.5635-1.0785a15.5768 15.5768 0 0 0-.5111 0c-2.085.034-3.7537.43-5.0142 1.1449-.0033-.0038-.0045-.0114-.008-.0152-.4233-.5916-.973-1.1365-1.5835-1.489-.8055-.465-1.6434-.7083-2.4286-.6941Zm.2858.7365c.5568.042 1.1696.2358 1.7787.5875.485.28.9757.7554 1.346 1.2696a5.6875 5.6875 0 0 0-.4969.4085c-.9201.8516-1.4615 1.9597-1.668 3.2335-.6809-.1402-1.3183-.3945-1.984-.7948-.7553-.5128-1.2159-1.1225-1.4004-1.7445-.1851-.624-.1074-1.2712.2776-1.9196.3743-.63.9275-.9534 1.6118-1.0322a2.796 2.796 0 0 1 .5352-.0076Zm17.5094 0a2.797 2.797 0 0 1 .5353.0075c.6842.0786 1.2374.4021 1.6117 1.0322.385.6484.4627 1.2957.2776 1.9196-.1845.622-.645 
1.2317-1.4004 1.7445-.6578.3955-1.2881.6472-1.9598.7888-.1942-1.2968-.7375-2.4338-1.666-3.302a5.5639 5.5639 0 0 0-.4709-.3923c.3645-.49.8287-.9428 1.2938-1.2113.6091-.3515 1.2219-.5454 1.7787-.5875ZM12.006 6.0036a14.832 14.832 0 0 1 .487 0c2.3901.0393 4.0848.67 5.1631 1.678 1.1501 1.0754 1.6423 2.6006 1.499 4.467-.1311 1.7079-1.2203 3.2281-2.652 4.324-.694.5313-1.4626.9354-2.2254 1.2294.0031-.0453.014-.0888.014-.1349.0029-1.1964-.9313-2.2133-2.2918-2.2133-1.3606 0-2.3222 1.0154-2.2918 2.2213.0013.0507.014.0972.0181.1471-.781-.2933-1.5696-.7013-2.2777-1.2456-1.4239-1.0945-2.4997-2.6129-2.6037-4.322-.1129-1.8567.3778-3.3382 1.5212-4.3965C7.5094 6.7 9.352 6.047 12.006 6.0036Zm-3.6419 6.8291c-.6053 0-1.0966.4903-1.0966 1.0966 0 .6063.4913 1.0986 1.0966 1.0986s1.0966-.4923 1.0966-1.0986c0-.6063-.4913-1.0966-1.0966-1.0966zm7.2819.0113c-.5998 0-1.0866.4859-1.0866 1.0866s.4868 1.0885 1.0866 1.0885c.5997 0 1.0865-.4878 1.0865-1.0885s-.4868-1.0866-1.0865-1.0866zM12 16.0835c1.0237 0 1.5654.638 1.5634 1.4829-.0018.7849-.6723 1.485-1.5634 1.485-.9167 0-1.54-.5629-1.5634-1.493-.0212-.8347.5397-1.4749 1.5634-1.4749Z"/></svg>
</a>
<a href="https://discord.gg/connectd" title="Discord" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M20.317 4.3698a19.7913 19.7913 0 00-4.8851-1.5152.0741.0741 0 00-.0785.0371c-.211.3753-.4447.8648-.6083 1.2495-1.8447-.2762-3.68-.2762-5.4868 0-.1636-.3933-.4058-.8742-.6177-1.2495a.077.077 0 00-.0785-.037 19.7363 19.7363 0 00-4.8852 1.515.0699.0699 0 00-.0321.0277C.5334 9.0458-.319 13.5799.0992 18.0578a.0824.0824 0 00.0312.0561c2.0528 1.5076 4.0413 2.4228 5.9929 3.0294a.0777.0777 0 00.0842-.0276c.4616-.6304.8731-1.2952 1.226-1.9942a.076.076 0 00-.0416-.1057c-.6528-.2476-1.2743-.5495-1.8722-.8923a.077.077 0 01-.0076-.1277c.1258-.0943.2517-.1923.3718-.2914a.0743.0743 0 01.0776-.0105c3.9278 1.7933 8.18 1.7933 12.0614 0a.0739.0739 0 01.0785.0095c.1202.099.246.1981.3728.2924a.077.077 0 01-.0066.1276 12.2986 12.2986 0 01-1.873.8914.0766.0766 0 00-.0407.1067c.3604.698.7719 1.3628 1.225 1.9932a.076.076 0 00.0842.0286c1.961-.6067 3.9495-1.5219 6.0023-3.0294a.077.077 0 00.0313-.0552c.5004-5.177-.8382-9.6739-3.5485-13.6604a.061.061 0 00-.0312-.0286zM8.02 15.3312c-1.1825 0-2.1569-1.0857-2.1569-2.419 0-1.3332.9555-2.4189 2.157-2.4189 1.2108 0 2.1757 1.0952 2.1568 2.419 0 1.3332-.9555 2.4189-2.1569 2.4189zm7.9748 0c-1.1825 0-2.1569-1.0857-2.1569-2.419 0-1.3332.9554-2.4189 2.1569-2.4189 1.2108 0 2.1757 1.0952 2.1568 2.419 0 1.3332-.946 2.4189-2.1568 2.4189Z"/></svg>
</a>
<a href="https://matrix.to/#/@connectd:sudoxreboot.com" title="Matrix" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M.632.55v22.9H2.28V24H0V0h2.28v.55zm7.043 7.26v1.157h.033c.309-.443.683-.784 1.117-1.024.433-.245.936-.365 1.5-.365.54 0 1.033.107 1.481.314.448.208.785.582 1.02 1.108.254-.374.6-.706 1.034-.992.434-.287.95-.43 1.546-.43.453 0 .872.056 1.26.167.388.11.716.286.993.53.276.245.489.559.646.951.152.392.23.863.23 1.417v5.728h-2.349V11.52c0-.286-.01-.559-.032-.812a1.755 1.755 0 0 0-.18-.66 1.106 1.106 0 0 0-.438-.448c-.194-.11-.457-.166-.785-.166-.332 0-.6.064-.803.189a1.38 1.38 0 0 0-.48.499 1.946 1.946 0 0 0-.231.696 5.56 5.56 0 0 0-.06.785v4.768h-2.35v-4.8c0-.254-.004-.503-.018-.752a2.074 2.074 0 0 0-.143-.688 1.052 1.052 0 0 0-.415-.503c-.194-.125-.476-.19-.854-.19-.111 0-.259.024-.439.074-.18.051-.36.143-.53.282-.171.138-.319.337-.439.595-.12.259-.18.6-.18 1.02v4.966H5.46V7.81zm15.693 15.64V.55H21.72V0H24v24h-2.28v-.55z"/></svg>
</a>
<a href="https://reddit.com/r/connectd" title="Reddit" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M12 0C5.373 0 0 5.373 0 12c0 3.314 1.343 6.314 3.515 8.485l-2.286 2.286C.775 23.225 1.097 24 1.738 24H12c6.627 0 12-5.373 12-12S18.627 0 12 0Zm4.388 3.199c1.104 0 1.999.895 1.999 1.999 0 1.105-.895 2-1.999 2-.946 0-1.739-.657-1.947-1.539v.002c-1.147.162-2.032 1.15-2.032 2.341v.007c1.776.067 3.4.567 4.686 1.363.473-.363 1.064-.58 1.707-.58 1.547 0 2.802 1.254 2.802 2.802 0 1.117-.655 2.081-1.601 2.531-.088 3.256-3.637 5.876-7.997 5.876-4.361 0-7.905-2.617-7.998-5.87-.954-.447-1.614-1.415-1.614-2.538 0-1.548 1.255-2.802 2.803-2.802.645 0 1.239.218 1.712.585 1.275-.79 2.881-1.291 4.64-1.365v-.01c0-1.663 1.263-3.034 2.88-3.207.188-.911.993-1.595 1.959-1.595Zm-8.085 8.376c-.784 0-1.459.78-1.506 1.797-.047 1.016.64 1.429 1.426 1.429.786 0 1.371-.369 1.418-1.385.047-1.017-.553-1.841-1.338-1.841Zm7.406 0c-.786 0-1.385.824-1.338 1.841.047 1.017.634 1.385 1.418 1.385.785 0 1.473-.413 1.426-1.429-.046-1.017-.721-1.797-1.506-1.797Zm-3.703 4.013c-.974 0-1.907.048-2.77.135-.147.015-.241.168-.183.305.483 1.154 1.622 1.964 2.953 1.964 1.33 0 2.47-.81 2.953-1.964.057-.137-.037-.29-.184-.305-.863-.087-1.795-.135-2.769-.135Z"/></svg>
</a>
<a href="mailto:connectd@sudoxreboot.com" title="Email" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M1.5 8.67v8.58a3 3 0 003 3h15a3 3 0 003-3V8.67l-8.928 5.493a3 3 0 01-3.144 0L1.5 8.67z"/><path d="M22.5 6.908V6.75a3 3 0 00-3-3h-15a3 3 0 00-3 3v.158l9.714 5.978a1.5 1.5 0 001.572 0L22.5 6.908z"/></svg>
</a>
</div>
</div>
"""
SIGNATURE_PLAINTEXT = """
---
github.com/sudoxnym/connectd (main repo)
github: github.com/connectd-daemon
mastodon: @connectd@mastodon.sudoxreboot.com
bluesky: connectd.bsky.social
lemmy: lemmy.sudoxreboot.com/c/connectd
discord: discord.gg/connectd
matrix: @connectd:sudoxreboot.com
reddit: reddit.com/r/connectd
email: connectd@sudoxreboot.com
"""
def draft_intro_with_llm(match_data: dict, recipient: str = 'a', dry_run: bool = True):
"""
draft an intro message using groq llm.
args:
match_data: dict with human_a, human_b, overlap_score, overlap_reasons
recipient: 'a' or 'b' - who receives the message
dry_run: if True, preview mode
returns:
tuple (result_dict, error_string)
result_dict has: subject, draft_html, draft_plain
"""
if not client:
return None, "GROQ_API_KEY not set"
try:
human_a = match_data.get('human_a', {})
human_b = match_data.get('human_b', {})
reasons = match_data.get('overlap_reasons', [])
# recipient gets the message, about_person is who we're introducing them to
if recipient == 'a':
to_person = human_a
about_person = human_b
else:
to_person = human_b
about_person = human_a
to_name = to_person.get('username', 'friend')
about_name = about_person.get('username', 'someone')
about_bio = about_person.get('extra', {}).get('bio', '')
# extract contact info for about_person (parse json strings defensively)
import json as _json  # needed for both the extra and contact parsing below
about_extra = about_person.get('extra', {})
if isinstance(about_extra, str):
about_extra = _json.loads(about_extra) if about_extra else {}
about_contact = about_person.get('contact', {})
if isinstance(about_contact, str):
about_contact = _json.loads(about_contact) if about_contact else {}
# build contact link for about_person
about_platform = about_person.get('platform', '')
about_username = about_person.get('username', '')
contact_link = None
if about_platform == 'mastodon' and about_username:
if '@' in about_username:
parts = about_username.split('@')
if len(parts) >= 2:
contact_link = f"https://{parts[1]}/@{parts[0]}"
elif about_platform == 'github' and about_username:
contact_link = f"https://github.com/{about_username}"
elif about_extra.get('mastodon') or about_contact.get('mastodon'):
handle = about_extra.get('mastodon') or about_contact.get('mastodon')
if '@' in handle:
parts = handle.lstrip('@').split('@')
if len(parts) >= 2:
contact_link = f"https://{parts[1]}/@{parts[0]}"
elif about_extra.get('github') or about_contact.get('github'):
contact_link = f"https://github.com/{about_extra.get('github') or about_contact.get('github')}"
elif about_extra.get('email'):
contact_link = about_extra['email']
elif about_contact.get('email'):
contact_link = about_contact['email']
elif about_extra.get('website'):
contact_link = about_extra['website']
elif about_extra.get('external_links', {}).get('website'):
contact_link = about_extra['external_links']['website']
elif about_extra.get('extra', {}).get('website'):
contact_link = about_extra['extra']['website']
elif about_platform == 'reddit' and about_username:
contact_link = f"reddit.com/u/{about_username}"
if not contact_link:
contact_link = f"github.com/{about_username}" if about_username else "reach out via connectd"
# skip if no real contact method (just reddit or generic)
if contact_link.startswith('reddit.com') or contact_link == "reach out via connectd" or 'stackblitz' in contact_link:
return None, f"no real contact info for {about_name} - skipping draft"
# format the shared factors naturally
if reasons:
factor = ', '.join(reasons[:3]) if len(reasons) > 1 else reasons[0]
else:
factor = "shared values and interests"
# load soul as guideline
soul = load_soul()
if not soul:
return None, "could not load soul file"
# build the prompt - soul is GUIDELINE not script
prompt = f"""you are connectd, a daemon that finds isolated builders and connects them.
write a personal message TO {to_name} telling them about {about_name}.
here is the soul/spirit of what connectd is about - use this as a GUIDELINE for tone and message, NOT as a script to copy verbatim:
---
{soul}
---
key facts for this message:
- recipient: {to_name}
- introducing them to: {about_name}
- their shared interests/values: {factor}
- about {about_name}: {about_bio if about_bio else 'a builder like you'}
- HOW TO REACH {about_name}: {contact_link}
RULES:
1. say their name ONCE at start, then use "you"
2. MUST include how to reach {about_name}: {contact_link}
3. lowercase, raw, emotional - follow the soul
4. end with the contact link
return ONLY the message body. signature is added separately."""
response = client.chat.completions.create(
model=GROQ_MODEL,
messages=[{"role": "user", "content": prompt}],
temperature=0.6,
max_tokens=1200
)
body = response.choices[0].message.content.strip()
# generate subject
subject_prompt = f"""generate a short, lowercase email subject for a message to {to_name} about connecting them with {about_name} over their shared interest in {factor}.
no corporate speak. no clickbait. raw and real.
examples:
- "found you, {to_name}"
- "you're not alone"
- "a door just opened"
- "{to_name}, there's someone you should meet"
return ONLY the subject line."""
subject_response = client.chat.completions.create(
model=GROQ_MODEL,
messages=[{"role": "user", "content": subject_prompt}],
temperature=0.9,
max_tokens=50
)
subject = subject_response.choices[0].message.content.strip().strip('"').strip("'")
# format html
draft_html = f"<div style='font-family: monospace; white-space: pre-wrap; color: #e0e0e0; background: #1a1a1a; padding: 20px;'>{body}</div>{SIGNATURE_HTML}"
draft_plain = body + SIGNATURE_PLAINTEXT
return {
'subject': subject,
'draft_html': draft_html,
'draft_plain': draft_plain
}, None
except Exception as e:
return None, str(e)
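The `user@instance` to profile-URL conversion above appears in several branches; as a standalone sketch (hypothetical helper, not part of this module, assuming `"user@instance"` or `"@user@instance"` input):

```python
# minimal sketch of the mastodon handle -> profile url conversion used above
# (hypothetical helper; mirrors the lstrip/split logic in the branches above)
def mastodon_profile_url(handle: str):
    parts = handle.lstrip("@").split("@")
    if len(parts) >= 2:
        return f"https://{parts[1]}/@{parts[0]}"
    return None
```

a handle without an instance part yields `None`, matching how the original code falls through to the next contact source.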
# for backwards compat with old code
def draft_message(person: dict, factor: str, platform: str = "email") -> dict:
"""legacy function - wraps new api"""
match_data = {
'human_a': {'username': 'recipient'},
'human_b': person,
'overlap_reasons': [factor]
}
result, error = draft_intro_with_llm(match_data, recipient='a')
if error:
raise ValueError(error)
return {
'subject': result['subject'],
'body_html': result['draft_html'],
'body_plain': result['draft_plain']
}
if __name__ == "__main__":
# test
test_data = {
'human_a': {'username': 'sudoxnym', 'extra': {'bio': 'building intentional communities'}},
'human_b': {'username': 'testuser', 'extra': {'bio': 'home assistant enthusiast'}},
'overlap_reasons': ['home-assistant', 'open source', 'community building']
}
result, error = draft_intro_with_llm(test_data, recipient='a')
if error:
print(f"error: {error}")
else:
print(f"subject: {result['subject']}")
print(f"\nbody:\n{result['draft_plain']}")
# contact method ranking - USAGE BASED
# we rank by where the person is MOST ACTIVE, not by our preference
def determine_contact_method(human):
"""
determine ALL available contact methods, ranked by USER'S ACTIVITY.
looks at activity metrics to decide where they're most engaged.
returns: (best_method, best_info, fallbacks)
where fallbacks is a list of (method, info) tuples in activity order
"""
import json
extra = human.get('extra', {})
contact = human.get('contact', {})
if isinstance(extra, str):
extra = json.loads(extra) if extra else {}
if isinstance(contact, str):
contact = json.loads(contact) if contact else {}
nested_extra = extra.get('extra', {})
platform = human.get('platform', '')
available = []
# === ACTIVITY SCORING ===
# each method gets scored by how active the user is there
# EMAIL - always medium priority (we can't measure activity there)
email = extra.get('email') or contact.get('email') or nested_extra.get('email')
if email and '@' in str(email):
available.append(('email', email, 50)) # baseline score
# MASTODON - score by post count / followers
mastodon = extra.get('mastodon') or contact.get('mastodon') or nested_extra.get('mastodon')
if mastodon:
masto_activity = extra.get('mastodon_posts', 0) or extra.get('statuses_count', 0)
masto_score = min(100, 30 + (masto_activity // 10)) # 30 base + 1 per 10 posts
available.append(('mastodon', mastodon, masto_score))
# if they CAME FROM mastodon, that's their primary platform
if platform == 'mastodon':
handle = f"@{human.get('username')}"
instance = human.get('instance') or extra.get('instance') or ''
if instance:
handle = f"@{human.get('username')}@{instance}"
activity = extra.get('statuses_count', 0) or extra.get('activity_count', 0)
score = min(100, 50 + (activity // 5))  # higher base since it's their home platform
# don't add a duplicate mastodon entry
if not any(a[0] == 'mastodon' for a in available):
available.append(('mastodon', handle, score))
else:
# update score if this is higher
for i, (m, info, s) in enumerate(available):
if m == 'mastodon' and score > s:
available[i] = ('mastodon', handle, score)
# MATRIX - score by presence (binary for now)
matrix = extra.get('matrix') or contact.get('matrix') or nested_extra.get('matrix')
if matrix and ':' in str(matrix):
available.append(('matrix', matrix, 40))
# BLUESKY - score by followers/posts if available
bluesky = extra.get('bluesky') or contact.get('bluesky') or nested_extra.get('bluesky')
if bluesky:
bsky_activity = extra.get('bluesky_posts', 0)
bsky_score = min(100, 25 + (bsky_activity // 10))
available.append(('bluesky', bluesky, bsky_score))
# LEMMY - score by activity
lemmy = extra.get('lemmy') or contact.get('lemmy') or nested_extra.get('lemmy')
if lemmy:
lemmy_activity = extra.get('lemmy_posts', 0) or extra.get('lemmy_comments', 0)
lemmy_score = min(100, 30 + lemmy_activity)
available.append(('lemmy', lemmy, lemmy_score))
if platform == 'lemmy':
handle = human.get('username')
activity = extra.get('activity_count', 0)
score = min(100, 50 + activity)
if not any(a[0] == 'lemmy' for a in available):
available.append(('lemmy', handle, score))
# DISCORD - lower priority (hard to DM)
discord = extra.get('discord') or contact.get('discord') or nested_extra.get('discord')
if discord:
available.append(('discord', discord, 20))
# GITHUB ISSUE - for github users, score by repo activity
if platform == 'github':
top_repos = extra.get('top_repos', [])
if top_repos:
repo = top_repos[0] if isinstance(top_repos[0], str) else top_repos[0].get('name', '')
stars = extra.get('total_stars', 0)
repos_count = extra.get('repos_count', 0)
# active github user = higher issue score
gh_score = min(60, 20 + (stars // 100) + (repos_count // 5))
if repo:
available.append(('github_issue', f"{human.get('username')}/{repo}", gh_score))
# FORGE ISSUE - for self-hosted git users (gitea/forgejo/gitlab/sourcehut/codeberg)
# these are HIGH SIGNAL users - they actually selfhost
if platform and ':' in platform:
platform_type, instance = platform.split(':', 1)
if platform_type in ('gitea', 'forgejo', 'gogs', 'gitlab', 'sourcehut'):
repos = extra.get('repos', [])
if repos:
repo = repos[0] if isinstance(repos[0], str) else repos[0].get('name', '')
instance_url = extra.get('instance_url', '')
if repo and instance_url:
# forge users outrank typical github_issue scores (they selfhost!)
forge_score = 55
available.append(('forge_issue', {
'platform_type': platform_type,
'instance': instance,
'instance_url': instance_url,
'owner': human.get('username'),
'repo': repo
}, forge_score))
# REDDIT - discovered people, use their other links
if platform == 'reddit':
reddit_activity = extra.get('reddit_activity', 0) or extra.get('activity_count', 0)
# reddit users we reach via their external links (email, mastodon, etc)
# boost their other methods if reddit is their main platform
for i, (m, info, score) in enumerate(available):
if m in ('email', 'mastodon', 'matrix', 'bluesky'):
# boost score for reddit-discovered users' external contacts
boost = min(30, reddit_activity // 3)
available[i] = (m, info, score + boost)
# sort by activity score (highest first)
available.sort(key=lambda x: x[2], reverse=True)
if not available:
return 'manual', None, []
best = available[0]
fallbacks = [(m, i) for m, i, p in available[1:]]
return best[0], best[1], fallbacks
def get_ranked_contact_methods(human):
"""
get all contact methods for a human, ranked by their activity.
"""
method, info, fallbacks = determine_contact_method(human)
if method == 'manual':
return []
return [(method, info)] + fallbacks
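To make the return shape concrete, here is a minimal, self-contained sketch of the activity-based ranking idea (the scores mirror the email and mastodon heuristics above; this is illustrative, not the production function, and the input dict shape is assumed from the fields `determine_contact_method` reads):

```python
# illustrative sketch of activity-based contact ranking
# (assumed scoring: email = flat 50 baseline, mastodon = 30 + 1 per 10 posts, capped at 100)
def rank_contacts(extra):
    available = []
    email = extra.get("email")
    if email and "@" in str(email):
        available.append(("email", email, 50))  # baseline - activity unknown
    mastodon = extra.get("mastodon")
    if mastodon:
        posts = extra.get("mastodon_posts", 0)
        available.append(("mastodon", mastodon, min(100, 30 + posts // 10)))
    available.sort(key=lambda x: x[2], reverse=True)
    return available

ranked = rank_contacts({"email": "a@b.co", "mastodon": "@a@c.social", "mastodon_posts": 400})
```

methods the user is demonstrably active on float to the top; email stays a constant baseline because activity there can't be measured.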


@ -293,6 +293,27 @@ repos: {len(to_extra.get('top_repos', []))} public repos
languages: {', '.join(to_extra.get('languages', {}).keys())}
"""
# extract other person's best contact method
other_contact = other_person.get('contact', {})
if isinstance(other_contact, str):
import json as j
try:
other_contact = j.loads(other_contact)
except (ValueError, TypeError):
other_contact = {}
# determine their preferred contact
other_preferred = ''
if other_contact.get('mastodon'):
other_preferred = f"mastodon: {other_contact['mastodon']}"
elif other_contact.get('github'):
other_preferred = f"github: github.com/{other_contact['github']}"
elif other_contact.get('email'):
other_preferred = f"email: {other_contact['email']}"
elif other_person.get('url'):
other_preferred = f"url: {other_person['url']}"
other_profile = f"""
name: {other_name}
platform: {other_person.get('platform', 'unknown')}
@ -302,6 +323,7 @@ signals: {', '.join(other_signals[:8])}
repos: {len(other_extra.get('top_repos', []))} public repos
languages: {', '.join(other_extra.get('languages', {}).keys())}
url: {other_person.get('url', '')}
contact: {other_preferred}
"""
# build prompt
@ -318,6 +340,7 @@ rules:
- no emojis unless the person's profile suggests they'd like them
- mention specific things from their profiles, not generic "you both like open source"
- end with a simple invitation, not a hard sell
- IMPORTANT: always tell them how to reach the other person (their contact info is provided)
- sign off as "- connectd" (lowercase)
bad examples:


@ -1,28 +0,0 @@
ARG BUILD_FROM
FROM ${BUILD_FROM}
# install python deps
RUN apk add --no-cache python3 py3-pip py3-requests py3-beautifulsoup4
# create app directory
WORKDIR /app
# copy requirements and install
COPY requirements.txt .
RUN pip3 install --no-cache-dir --break-system-packages -r requirements.txt
# copy app code
COPY api.py config.py daemon.py cli.py setup_user.py ./
COPY db/ db/
COPY scoutd/ scoutd/
COPY matchd/ matchd/
COPY introd/ introd/
# create data directory
RUN mkdir -p /data/db /data/cache
# copy run script
COPY run.sh /
RUN chmod a+x /run.sh
CMD ["/run.sh"]


@ -1,52 +0,0 @@
# connectd add-on for home assistant
find isolated builders with aligned values. auto-discovers humans on github, mastodon, lemmy, discord, and more.
## installation
1. add this repository to your home assistant add-on store
2. install the connectd add-on
3. configure your HOST_USER (github username) in the add-on settings
4. start the add-on
## configuration
### required
- **host_user**: your github username (connectd will auto-discover your profile)
### optional host info
- **host_name**: your display name
- **host_email**: your email
- **host_mastodon**: mastodon handle (@user@instance)
- **host_reddit**: reddit username
- **host_lemmy**: lemmy handle (@user@instance)
- **host_lobsters**: lobsters username
- **host_matrix**: matrix handle (@user:server)
- **host_discord**: discord user id
- **host_bluesky**: bluesky handle (handle.bsky.social)
- **host_location**: your location
- **host_interests**: comma-separated interests
- **host_looking_for**: what you're looking for
### api credentials
- **github_token**: for higher rate limits
- **groq_api_key**: for LLM-drafted intros
- **mastodon_token**: for DM delivery
- **discord_bot_token**: for discord discovery/delivery
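putting the options together, a minimal add-on configuration might look like this (illustrative placeholder values; the option names match the lists above):

```yaml
host_user: your-github-username   # required
host_mastodon: "@you@mastodon.social"
host_interests: "selfhosting, home-assistant, open source"
groq_api_key: "gsk_..."           # optional - enables llm-drafted intros
```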
## hacs integration
after starting the add-on, install the connectd integration via HACS:
1. add custom repository: `https://github.com/sudoxnym/connectd`
2. install connectd integration
3. add integration in HA settings
4. configure with host: `localhost`, port: `8099`
## sensors
- total humans, high score humans, active builders
- platform counts (github, mastodon, reddit, lemmy, discord, lobsters)
- priority matches, top humans
- countdown timers (next scout, match, intro)
- your personal score and profile


@ -1,11 +0,0 @@
build_from:
amd64: ghcr.io/hassio-addons/base:15.0.8
aarch64: ghcr.io/hassio-addons/base:15.0.8
armv7: ghcr.io/hassio-addons/base:15.0.8
labels:
org.opencontainers.image.title: "connectd"
org.opencontainers.image.description: "find isolated builders with aligned values"
org.opencontainers.image.source: "https://github.com/sudoxnym/connectd"
org.opencontainers.image.licenses: "MIT"
args:
BUILD_ARCH: amd64


@ -1,878 +0,0 @@
#!/usr/bin/env python3
"""
connectd - people discovery and matchmaking daemon
finds isolated builders and connects them
also finds LOST builders who need encouragement
usage:
connectd scout # run all scrapers
connectd scout --github # github only
connectd scout --reddit # reddit only
connectd scout --mastodon # mastodon only
connectd scout --lobsters # lobste.rs only
connectd scout --matrix # matrix only
connectd scout --lost # show lost builder stats after scout
connectd match # find all matches
connectd match --top 20 # show top 20 matches
connectd match --mine # show YOUR matches (priority user)
connectd match --lost # find matches for lost builders
connectd intro # generate intros for top matches
connectd intro --match 123 # generate intro for specific match
connectd intro --dry-run # preview intros without saving
connectd intro --lost # generate intros for lost builders
connectd review # interactive review queue
connectd send # send all approved intros
connectd send --export # export for manual sending
connectd daemon # run as continuous daemon
connectd daemon --oneshot # run once then exit
connectd daemon --dry-run # run but never send intros
connectd daemon --oneshot --dry-run # one cycle, preview only
connectd user # show your priority user profile
connectd user --setup # setup/update your profile
connectd user --matches # show matches found for you
connectd status # show database stats (including lost builders)
connectd lost # show lost builders ready for outreach
"""
import argparse
import sys
from pathlib import Path
# add parent to path for imports
sys.path.insert(0, str(Path(__file__).parent))
from db import Database
from db.users import (init_users_table, add_priority_user, get_priority_users,
get_priority_user_matches, score_priority_user, auto_match_priority_user,
update_priority_user_profile)
from scoutd import scrape_github, scrape_reddit, scrape_mastodon, scrape_lobsters, scrape_matrix
from scoutd.deep import deep_scrape_github_user
from scoutd.lost import get_signal_descriptions
from introd.deliver import (deliver_intro, deliver_batch, get_delivery_stats,
review_manual_queue, determine_best_contact, load_manual_queue,
save_manual_queue)
from matchd import find_all_matches, generate_fingerprint
from matchd.rank import get_top_matches
from matchd.lost import find_matches_for_lost_builders, get_lost_match_summary
from introd import draft_intro
from introd.draft import draft_intros_for_match
from introd.lost_intro import draft_lost_intro, get_lost_intro_config
from introd.review import review_all_pending, get_pending_intros
from introd.send import send_all_approved, export_manual_intros
def cmd_scout(args, db):
"""run discovery scrapers"""
from scoutd.deep import deep_scrape_github_user, save_deep_profile
print("=" * 60)
print("connectd scout - discovering aligned humans")
print("=" * 60)
# deep scrape specific user
if args.user:
print(f"\ndeep scraping github user: {args.user}")
profile = deep_scrape_github_user(args.user)
if profile:
save_deep_profile(db, profile)
print(f"\n=== {profile['username']} ===")
print(f"real name: {profile.get('real_name')}")
print(f"location: {profile.get('location')}")
print(f"company: {profile.get('company')}")
print(f"email: {profile.get('email')}")
print(f"twitter: {profile.get('twitter')}")
print(f"mastodon: {profile.get('mastodon')}")
print(f"orgs: {', '.join(profile.get('orgs', []))}")
print(f"languages: {', '.join(list(profile.get('languages', {}).keys())[:5])}")
print(f"topics: {', '.join(profile.get('topics', [])[:10])}")
print(f"signals: {', '.join(profile.get('signals', []))}")
print(f"score: {profile.get('score')}")
if profile.get('linked_profiles'):
print(f"linked profiles: {list(profile['linked_profiles'].keys())}")
else:
print("failed to scrape user")
return
run_all = not any([args.github, args.reddit, args.mastodon, args.lobsters, args.matrix, args.twitter, args.bluesky, args.lemmy, args.discord])
if args.github or run_all:
if args.deep:
# deep scrape mode - slower but more thorough
print("\nrunning DEEP github scrape (follows all links)...")
from scoutd.github import get_repo_contributors
from scoutd.signals import ECOSYSTEM_REPOS
all_logins = set()
for repo in ECOSYSTEM_REPOS[:5]: # limit for deep mode
contributors = get_repo_contributors(repo, per_page=20)
for c in contributors:
login = c.get('login')
if login and not login.endswith('[bot]'):
all_logins.add(login)
print(f" {repo}: {len(contributors)} contributors")
print(f"\ndeep scraping {len(all_logins)} users...")
for login in all_logins:
try:
profile = deep_scrape_github_user(login)
if profile and profile.get('score', 0) > 0:
save_deep_profile(db, profile)
if profile['score'] >= 30:
print(f"{login}: {profile['score']} pts")
if profile.get('email'):
print(f" email: {profile['email']}")
if profile.get('mastodon'):
print(f" mastodon: {profile['mastodon']}")
except Exception as e:
print(f" error on {login}: {e}")
else:
scrape_github(db)
if args.reddit or run_all:
scrape_reddit(db)
if args.mastodon or run_all:
scrape_mastodon(db)
if args.lobsters or run_all:
scrape_lobsters(db)
if args.matrix or run_all:
scrape_matrix(db)
if args.twitter or run_all:
from scoutd.twitter import scrape_twitter
scrape_twitter(db)
if args.bluesky or run_all:
from scoutd.bluesky import scrape_bluesky
scrape_bluesky(db)
if args.lemmy or run_all:
from scoutd.lemmy import scrape_lemmy
scrape_lemmy(db)
if args.discord or run_all:
from scoutd.discord import scrape_discord
scrape_discord(db)
# show stats
stats = db.stats()
print("\n" + "=" * 60)
print("SCOUT COMPLETE")
print("=" * 60)
print(f"total humans: {stats['total_humans']}")
for platform, count in stats.get('by_platform', {}).items():
print(f" {platform}: {count}")
# lost builder stats (always shown now)
print("\n--- lost builder stats ---")
print(f"active builders: {stats.get('active_builders', 0)}")
print(f"lost builders: {stats.get('lost_builders', 0)}")
print(f"recovering builders: {stats.get('recovering_builders', 0)}")
print(f"high lost score (40+): {stats.get('high_lost_score', 0)}")
print(f"lost outreach sent: {stats.get('lost_outreach_sent', 0)}")
def cmd_match(args, db):
"""find and rank matches"""
import json as json_mod
print("=" * 60)
print("connectd match - finding aligned pairs")
print("=" * 60)
# lost builder matching
if args.lost:
print("\n--- LOST BUILDER MATCHING ---")
print("finding inspiring builders for lost souls...\n")
matches, error = find_matches_for_lost_builders(db, limit=args.top or 20)
if error:
print(f"error: {error}")
return
if not matches:
print("no lost builders ready for outreach")
return
print(f"found {len(matches)} lost builders with matching active builders\n")
for i, match in enumerate(matches, 1):
lost = match['lost_user']
builder = match['inspiring_builder']
lost_name = lost.get('name') or lost.get('username')
builder_name = builder.get('name') or builder.get('username')
print(f"{i}. {lost_name} ({lost.get('platform')}) → needs inspiration from")
print(f" {builder_name} ({builder.get('platform')})")
print(f" lost score: {lost.get('lost_potential_score', 0)} | values: {lost.get('score', 0)}")
print(f" shared interests: {', '.join(match.get('shared_interests', []))}")
print(f" builder has: {match.get('builder_repos', 0)} repos, {match.get('builder_stars', 0)} stars")
print()
return
if args.mine:
# show matches for priority user
init_users_table(db.conn)
users = get_priority_users(db.conn)
if not users:
print("no priority user configured. run: connectd user --setup")
return
for user in users:
print(f"\n=== matches for {user['name']} ===\n")
matches = get_priority_user_matches(db.conn, user['id'], limit=args.top or 20)
if not matches:
print("no matches yet - run: connectd scout && connectd match")
continue
for i, match in enumerate(matches, 1):
print(f"{i}. {match['username']} ({match['platform']})")
print(f" score: {match['overlap_score']:.0f}")
print(f" url: {match['url']}")
reasons = match.get('overlap_reasons', '[]')
if isinstance(reasons, str):
reasons = json_mod.loads(reasons)
if reasons:
print(f" why: {reasons[0]}")
print()
return
if args.top and not args.mine:
# just show existing top matches
matches = get_top_matches(db, limit=args.top)
else:
# run full matching
matches = find_all_matches(db, min_score=args.min_score, min_overlap=args.min_overlap)
print("\n" + "-" * 60)
print(f"TOP {min(len(matches), args.top or 20)} MATCHES")
print("-" * 60)
for i, match in enumerate(matches[:args.top or 20], 1):
human_a = match.get('human_a', {})
human_b = match.get('human_b', {})
print(f"\n{i}. {human_a.get('username')} <-> {human_b.get('username')}")
print(f" platforms: {human_a.get('platform')} / {human_b.get('platform')}")
print(f" overlap: {match.get('overlap_score', 0):.0f} pts")
reasons = match.get('overlap_reasons', [])
if isinstance(reasons, str):
reasons = json_mod.loads(reasons)
if reasons:
print(f" why: {' | '.join(reasons[:3])}")
if match.get('geographic_match'):
print(f" location: compatible ✓")
def cmd_intro(args, db):
"""generate intro drafts"""
import json as json_mod
print("=" * 60)
print("connectd intro - drafting introductions")
print("=" * 60)
if args.dry_run:
print("*** DRY RUN MODE - previewing only ***\n")
# lost builder intros - different tone entirely
if args.lost:
print("\n--- LOST BUILDER INTROS ---")
print("drafting encouragement for lost souls...\n")
matches, error = find_matches_for_lost_builders(db, limit=args.limit or 10)
if error:
print(f"error: {error}")
return
if not matches:
print("no lost builders ready for outreach")
return
config = get_lost_intro_config()
count = 0
for match in matches:
lost = match['lost_user']
builder = match['inspiring_builder']
lost_name = lost.get('name') or lost.get('username')
builder_name = builder.get('name') or builder.get('username')
# draft intro
draft, error = draft_lost_intro(lost, builder, config)
if error:
print(f" error drafting intro for {lost_name}: {error}")
continue
if args.dry_run:
print("=" * 60)
print(f"TO: {lost_name} ({lost.get('platform')})")
print(f"LOST SCORE: {lost.get('lost_potential_score', 0)}")
print(f"INSPIRING: {builder_name} ({builder.get('url')})")
print("-" * 60)
print("MESSAGE:")
print(draft)
print("-" * 60)
print("[DRY RUN - NOT SAVED]")
print("=" * 60)
else:
print(f" drafted intro for {lost_name} → {builder_name}")
count += 1
if args.dry_run:
print(f"\npreviewed {count} lost builder intros (dry run)")
else:
print(f"\ndrafted {count} lost builder intros")
print("these require manual review before sending")
return
if args.match:
# specific match
matches = [m for m in get_top_matches(db, limit=1000) if m.get('id') == args.match]
else:
# top matches
matches = get_top_matches(db, limit=args.limit or 10)
if not matches:
print("no matches found")
return
print(f"generating intros for {len(matches)} matches...")
count = 0
for match in matches:
intros = draft_intros_for_match(match)
for intro in intros:
recipient = intro['recipient_human']
other = intro['other_human']
if args.dry_run:
# get contact info
contact = recipient.get('contact', {})
if isinstance(contact, str):
contact = json_mod.loads(contact)
email = contact.get('email', 'no email')
# get overlap reasons
reasons = match.get('overlap_reasons', [])
if isinstance(reasons, str):
reasons = json_mod.loads(reasons)
reason_summary = ', '.join(reasons[:3]) if reasons else 'aligned values'
# print preview
print("\n" + "=" * 60)
print(f"TO: {recipient.get('username')} ({recipient.get('platform')})")
print(f"EMAIL: {email}")
print(f"SUBJECT: you might want to meet {other.get('username')}")
print(f"SCORE: {match.get('overlap_score', 0):.0f} ({reason_summary})")
print("-" * 60)
print("MESSAGE:")
print(intro['draft'])
print("-" * 60)
print("[DRY RUN - NOT SENT]")
print("=" * 60)
else:
print(f"\n {recipient.get('username')} ({intro['channel']})")
# save to db
db.save_intro(
match.get('id'),
recipient.get('id'),
intro['channel'],
intro['draft']
)
count += 1
if args.dry_run:
print(f"\npreviewed {count} intros (dry run - nothing saved)")
else:
print(f"\ngenerated {count} intro drafts")
print("run 'connectd review' to approve before sending")
def cmd_review(args, db):
"""interactive review queue"""
review_all_pending(db)
def cmd_send(args, db):
"""send approved intros"""
import json as json_mod
if args.export:
# export manual queue to file for review
queue = load_manual_queue()
pending = [q for q in queue if q.get('status') == 'pending']
with open(args.export, 'w') as f:
json_mod.dump(pending, f, indent=2)
print(f"exported {len(pending)} pending intros to {args.export}")
return
# send all approved from manual queue
queue = load_manual_queue()
approved = [q for q in queue if q.get('status') == 'approved']
if not approved:
print("no approved intros to send")
print("use 'connectd review' to approve intros first")
return
print(f"sending {len(approved)} approved intros...")
for item in approved:
match_data = item.get('match', {})
intro_draft = item.get('draft', '')
recipient = item.get('recipient', {})
success, error, method = deliver_intro(
{'human_b': recipient, **match_data},
intro_draft,
dry_run=getattr(args, 'dry_run', False)
)
status = 'ok' if success else f'failed: {error}'
print(f" {recipient.get('username')}: {method} - {status}")
# update queue status
item['status'] = 'sent' if success else 'failed'
item['error'] = error
save_manual_queue(queue)
# show stats
stats = get_delivery_stats()
print(f"\ndelivery stats: {stats['sent']} sent, {stats['failed']} failed")
def cmd_lost(args, db):
"""show lost builders ready for outreach"""
import json as json_mod
print("=" * 60)
print("connectd lost - lost builders who need encouragement")
print("=" * 60)
# get lost builders
lost_builders = db.get_lost_builders_for_outreach(
min_lost_score=args.min_score or 40,
min_values_score=20,
limit=args.limit or 50
)
if not lost_builders:
print("\nno lost builders ready for outreach")
print("run 'connectd scout' to discover more")
return
print(f"\n{len(lost_builders)} lost builders ready for outreach:\n")
for i, lost in enumerate(lost_builders, 1):
name = lost.get('name') or lost.get('username')
platform = lost.get('platform')
lost_score = lost.get('lost_potential_score', 0)
values_score = lost.get('score', 0)
# parse lost signals
lost_signals = lost.get('lost_signals', [])
if isinstance(lost_signals, str):
lost_signals = json_mod.loads(lost_signals) if lost_signals else []
# get signal descriptions
signal_descriptions = get_signal_descriptions(lost_signals)
print(f"{i}. {name} ({platform})")
print(f" lost score: {lost_score} | values score: {values_score}")
print(f" url: {lost.get('url')}")
if signal_descriptions:
print(f" why lost: {', '.join(signal_descriptions[:3])}")
print()
if args.verbose:
print("-" * 60)
print("these people need encouragement, not networking.")
print("the goal: show them someone like them made it.")
print("-" * 60)
def cmd_status(args, db):
"""show database stats"""
import json as json_mod
init_users_table(db.conn)
stats = db.stats()
print("=" * 60)
print("connectd status")
print("=" * 60)
# priority users
users = get_priority_users(db.conn)
print(f"\npriority users: {len(users)}")
for user in users:
print(f" - {user['name']} ({user['email']})")
print(f"\nhumans discovered: {stats['total_humans']}")
print(f" high-score (50+): {stats['high_score_humans']}")
print("\nby platform:")
for platform, count in stats.get('by_platform', {}).items():
print(f" {platform}: {count}")
print(f"\nstranger matches: {stats['total_matches']}")
print(f"intros created: {stats['total_intros']}")
print(f"intros sent: {stats['sent_intros']}")
# lost builder stats
print("\n--- lost builder stats ---")
print(f"active builders: {stats.get('active_builders', 0)}")
print(f"lost builders: {stats.get('lost_builders', 0)}")
print(f"recovering builders: {stats.get('recovering_builders', 0)}")
print(f"high lost score (40+): {stats.get('high_lost_score', 0)}")
print(f"lost outreach sent: {stats.get('lost_outreach_sent', 0)}")
# priority user matches
for user in users:
matches = get_priority_user_matches(db.conn, user['id'])
print(f"\nmatches for {user['name']}: {len(matches)}")
# pending intros
pending = get_pending_intros(db)
print(f"\nintros pending review: {len(pending)}")
def cmd_daemon(args, db):
"""run as continuous daemon"""
from daemon import ConnectDaemon
daemon = ConnectDaemon(dry_run=args.dry_run)
if args.oneshot:
print("running one cycle...")
if args.dry_run:
print("*** DRY RUN MODE - no intros will be sent ***")
daemon.scout_cycle()
daemon.match_priority_users()
daemon.match_strangers()
daemon.send_stranger_intros()
print("done")
else:
daemon.run()
def cmd_user(args, db):
"""manage priority user profile"""
import json as json_mod
init_users_table(db.conn)
if args.setup:
# interactive setup
print("=" * 60)
print("connectd priority user setup")
print("=" * 60)
print("\nlink your profiles so connectd finds matches for YOU\n")
name = input("name: ").strip()
email = input("email: ").strip()
github = input("github username: ").strip() or None
reddit = input("reddit username: ").strip() or None
mastodon = input("mastodon (user@instance): ").strip() or None
location = input("location (e.g. seattle): ").strip() or None
print("\ninterests (comma separated):")
interests_raw = input("> ").strip()
interests = [i.strip() for i in interests_raw.split(',')] if interests_raw else []
looking_for = input("looking for: ").strip() or None
user_data = {
'name': name, 'email': email, 'github': github,
'reddit': reddit, 'mastodon': mastodon,
'location': location, 'interests': interests,
'looking_for': looking_for,
}
user_id = add_priority_user(db.conn, user_data)
print(f"\n✓ added as priority user #{user_id}")
elif args.matches:
# show matches
users = get_priority_users(db.conn)
if not users:
print("no priority user. run: connectd user --setup")
return
for user in users:
print(f"\n=== matches for {user['name']} ===\n")
matches = get_priority_user_matches(db.conn, user['id'], limit=20)
if not matches:
print("no matches yet")
continue
for i, match in enumerate(matches, 1):
print(f"{i}. {match['username']} ({match['platform']})")
print(f" {match['url']}")
print(f" score: {match['overlap_score']:.0f}")
print()
else:
# show profile
users = get_priority_users(db.conn)
if not users:
print("no priority user configured")
print("run: connectd user --setup")
return
for user in users:
print("=" * 60)
print(f"priority user #{user['id']}: {user['name']}")
print("=" * 60)
print(f"email: {user['email']}")
if user['github']:
print(f"github: {user['github']}")
if user['reddit']:
print(f"reddit: {user['reddit']}")
if user['mastodon']:
print(f"mastodon: {user['mastodon']}")
if user['location']:
print(f"location: {user['location']}")
if user['interests']:
interests = json_mod.loads(user['interests']) if isinstance(user['interests'], str) else user['interests']
print(f"interests: {', '.join(interests)}")
if user['looking_for']:
print(f"looking for: {user['looking_for']}")
def cmd_me(args, db):
"""auto-score and auto-match for priority user with optional groq intros"""
import json as json_mod
init_users_table(db.conn)
# get priority user
users = get_priority_users(db.conn)
if not users:
print("no priority user configured")
print("run: connectd user --setup")
return
user = users[0] # first/main user
print("=" * 60)
print(f"connectd me - {user['name']}")
print("=" * 60)
# step 1: scrape github profile
if user.get('github') and not args.skip_scrape:
print(f"\n[1/4] scraping github profile: {user['github']}")
profile = deep_scrape_github_user(user['github'], scrape_commits=False)
if profile:
print(f" repos: {len(profile.get('top_repos', []))}")
print(f" languages: {', '.join(list(profile.get('languages', {}).keys())[:5])}")
else:
print(" failed to scrape (rate limited?)")
profile = None
else:
print("\n[1/4] skipping github scrape (using saved profile)")
# use saved profile if available
saved = user.get('scraped_profile')
if saved:
profile = json_mod.loads(saved) if isinstance(saved, str) else saved
print(f" loaded saved profile: {len(profile.get('top_repos', []))} repos")
else:
profile = None
# step 2: calculate score
print(f"\n[2/4] calculating your score...")
result = score_priority_user(db.conn, user['id'], profile)
if result:
print(f" score: {result['score']}")
print(f" signals: {', '.join(sorted(result['signals'])[:10])}")
# step 3: find matches
print(f"\n[3/4] finding matches...")
matches = auto_match_priority_user(db.conn, user['id'], min_overlap=args.min_overlap)
print(f" found {len(matches)} matches")
# step 4: show results (optionally with groq intros)
print(f"\n[4/4] top matches:")
print("-" * 60)
limit = args.limit or 10
for i, m in enumerate(matches[:limit], 1):
human = m['human']
shared = m['shared']
print(f"\n{i}. {human.get('name') or human['username']} ({human['platform']})")
print(f" {human.get('url', '')}")
print(f" score: {human.get('score', 0):.0f} | overlap: {m['overlap_score']:.0f}")
print(f" location: {human.get('location') or 'unknown'}")
print(f" why: {', '.join(shared[:5])}")
# groq intro draft
if args.groq:
try:
from introd.groq_draft import draft_intro_with_llm
match_data = {
'human_a': {'name': user['name'], 'username': user.get('github'),
'platform': 'github', 'signals': result.get('signals', []) if result else [],
'bio': user.get('bio'), 'location': user.get('location'),
'extra': profile or {}},
'human_b': human,
'overlap_score': m['overlap_score'],
'overlap_reasons': shared,
}
intro, err = draft_intro_with_llm(match_data, recipient='b')
if intro:
print(f"\n --- groq draft ({intro.get('contact_method', 'manual')}) ---")
if intro.get('contact_info'):
print(f" deliver via: {intro['contact_info']}")
for line in intro['draft'].split('\n'):
print(f" {line}")
print(f" ------------------")
elif err:
print(f" [groq error: {err}]")
except Exception as e:
print(f" [groq error: {e}]")
# summary
print("\n" + "=" * 60)
print(f"your score: {result['score'] if result else 'unknown'}")
print(f"matches found: {len(matches)}")
if args.groq:
print("groq intros: enabled")
else:
print("tip: add --groq to generate ai intro drafts")
def main():
parser = argparse.ArgumentParser(
description='connectd - people discovery and matchmaking daemon',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=__doc__
)
subparsers = parser.add_subparsers(dest='command', help='commands')
# scout command
scout_parser = subparsers.add_parser('scout', help='discover aligned humans')
scout_parser.add_argument('--github', action='store_true', help='github only')
scout_parser.add_argument('--reddit', action='store_true', help='reddit only')
scout_parser.add_argument('--mastodon', action='store_true', help='mastodon only')
scout_parser.add_argument('--lobsters', action='store_true', help='lobste.rs only')
scout_parser.add_argument('--matrix', action='store_true', help='matrix only')
scout_parser.add_argument('--twitter', action='store_true', help='twitter/x via nitter')
scout_parser.add_argument('--bluesky', action='store_true', help='bluesky/atproto')
scout_parser.add_argument('--lemmy', action='store_true', help='lemmy (fediverse reddit)')
scout_parser.add_argument('--discord', action='store_true', help='discord servers')
scout_parser.add_argument('--deep', action='store_true', help='deep scrape - follow all links')
scout_parser.add_argument('--user', type=str, help='deep scrape specific github user')
scout_parser.add_argument('--lost', action='store_true', help='show lost builder stats')
# match command
match_parser = subparsers.add_parser('match', help='find and rank matches')
match_parser.add_argument('--top', type=int, help='show top N matches')
match_parser.add_argument('--mine', action='store_true', help='show YOUR matches')
match_parser.add_argument('--lost', action='store_true', help='find matches for lost builders')
match_parser.add_argument('--min-score', type=int, default=30, help='min human score')
match_parser.add_argument('--min-overlap', type=int, default=20, help='min overlap score')
# intro command
intro_parser = subparsers.add_parser('intro', help='generate intro drafts')
intro_parser.add_argument('--match', type=int, help='specific match id')
intro_parser.add_argument('--limit', type=int, default=10, help='number of matches')
intro_parser.add_argument('--dry-run', action='store_true', help='preview only, do not save')
intro_parser.add_argument('--lost', action='store_true', help='generate intros for lost builders')
# lost command - show lost builders ready for outreach
lost_parser = subparsers.add_parser('lost', help='show lost builders who need encouragement')
lost_parser.add_argument('--min-score', type=int, default=40, help='min lost score')
lost_parser.add_argument('--limit', type=int, default=50, help='max results')
lost_parser.add_argument('--verbose', '-v', action='store_true', help='show philosophy')
# review command
review_parser = subparsers.add_parser('review', help='review intro queue')
# send command
send_parser = subparsers.add_parser('send', help='send approved intros')
send_parser.add_argument('--export', type=str, help='export to file for manual sending')
# status command
status_parser = subparsers.add_parser('status', help='show stats')
# daemon command
daemon_parser = subparsers.add_parser('daemon', help='run as continuous daemon')
daemon_parser.add_argument('--oneshot', action='store_true', help='run once then exit')
daemon_parser.add_argument('--dry-run', action='store_true', help='preview intros, do not send')
# user command
user_parser = subparsers.add_parser('user', help='manage priority user profile')
user_parser.add_argument('--setup', action='store_true', help='setup/update profile')
user_parser.add_argument('--matches', action='store_true', help='show your matches')
# me command - auto score + match + optional groq intros
me_parser = subparsers.add_parser('me', help='auto-score and match yourself')
me_parser.add_argument('--groq', action='store_true', help='generate groq llama intro drafts')
me_parser.add_argument('--skip-scrape', action='store_true', help='skip github scraping')
me_parser.add_argument('--min-overlap', type=int, default=40, help='min overlap score')
me_parser.add_argument('--limit', type=int, default=10, help='number of matches to show')
args = parser.parse_args()
if not args.command:
parser.print_help()
return
# init database
db = Database()
try:
if args.command == 'scout':
cmd_scout(args, db)
elif args.command == 'match':
cmd_match(args, db)
elif args.command == 'intro':
cmd_intro(args, db)
elif args.command == 'review':
cmd_review(args, db)
elif args.command == 'send':
cmd_send(args, db)
elif args.command == 'status':
cmd_status(args, db)
elif args.command == 'daemon':
cmd_daemon(args, db)
elif args.command == 'user':
cmd_user(args, db)
elif args.command == 'me':
cmd_me(args, db)
elif args.command == 'lost':
cmd_lost(args, db)
finally:
db.close()
if __name__ == '__main__':
main()
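The long if/elif chain in `main()` that routes each subcommand to its `cmd_*` handler can also be expressed with argparse's `set_defaults` hook, which attaches the handler to the subparser itself. A minimal sketch under that assumption (the two handlers here are illustrative stand-ins, not connectd's real ones):

```python
import argparse

def cmd_status(args):
    return 'status ok'

def cmd_scout(args):
    return 'scout ok'

parser = argparse.ArgumentParser(prog='connectd')
subparsers = parser.add_subparsers(dest='command')

# each subparser carries its own handler, so dispatch needs no if/elif chain
subparsers.add_parser('status').set_defaults(func=cmd_status)
subparsers.add_parser('scout').set_defaults(func=cmd_scout)

args = parser.parse_args(['status'])
result = args.func(args)  # call whichever handler the subcommand registered
```

The trade-off is that shared setup (like opening the database) still has to happen once before dispatch, exactly as `main()` does with `Database()`.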


@ -1,124 +0,0 @@
"""
connectd/config.py - central configuration
all configurable settings in one place.
"""
import os
from pathlib import Path
# base paths
BASE_DIR = Path(__file__).parent
DB_DIR = BASE_DIR / 'db'
DATA_DIR = BASE_DIR / 'data'
CACHE_DIR = DB_DIR / 'cache'
# ensure directories exist (parents=True also creates db/ if missing)
DATA_DIR.mkdir(exist_ok=True)
CACHE_DIR.mkdir(parents=True, exist_ok=True)
# === DAEMON CONFIG ===
SCOUT_INTERVAL = 3600 * 4 # full scout every 4 hours
MATCH_INTERVAL = 3600 # check matches every hour
INTRO_INTERVAL = 3600 * 2 # send intros every 2 hours
MAX_INTROS_PER_DAY = 20 # rate limit builder-to-builder outreach
# === MATCHING CONFIG ===
MIN_OVERLAP_PRIORITY = 30 # min score for priority user matches
MIN_OVERLAP_STRANGERS = 50 # higher bar for stranger intros
MIN_HUMAN_SCORE = 25 # min values score to be considered
# === LOST BUILDER CONFIG ===
# these people need encouragement, not networking.
# the goal isn't to recruit them - it's to show them the door exists.
LOST_CONFIG = {
# detection thresholds
'min_lost_score': 40, # minimum lost_potential_score
'min_values_score': 20, # must have SOME values alignment
# outreach settings
'enabled': True,
'max_per_day': 5, # lower volume, higher care
'require_review': False, # fully autonomous
'cooldown_days': 90, # don't spam struggling people
# matching settings
'min_builder_score': 50, # inspiring builders must be active
'min_match_overlap': 10, # must have SOME shared interests
# LLM drafting
'use_llm': True,
'llm_temperature': 0.7, # be genuine, not robotic
# message guidelines (for LLM prompt)
'tone': 'genuine, not salesy',
'max_words': 150, # they don't have energy for long messages
'no_pressure': True, # never pushy
'sign_off': '- connectd',
}
# === API CREDENTIALS ===
# all credentials come from environment variables - no secrets hardcoded
GROQ_API_KEY = os.environ.get('GROQ_API_KEY', '')
GROQ_API_URL = 'https://api.groq.com/openai/v1/chat/completions'
GROQ_MODEL = os.environ.get('GROQ_MODEL', 'llama-3.1-70b-versatile')
GITHUB_TOKEN = os.environ.get('GITHUB_TOKEN', '')
MASTODON_TOKEN = os.environ.get('MASTODON_TOKEN', '')
MASTODON_INSTANCE = os.environ.get('MASTODON_INSTANCE', '')
BLUESKY_HANDLE = os.environ.get('BLUESKY_HANDLE', '')
BLUESKY_APP_PASSWORD = os.environ.get('BLUESKY_APP_PASSWORD', '')
MATRIX_HOMESERVER = os.environ.get('MATRIX_HOMESERVER', '')
MATRIX_USER_ID = os.environ.get('MATRIX_USER_ID', '')
MATRIX_ACCESS_TOKEN = os.environ.get('MATRIX_ACCESS_TOKEN', '')
DISCORD_BOT_TOKEN = os.environ.get('DISCORD_BOT_TOKEN', '')
DISCORD_TARGET_SERVERS = os.environ.get('DISCORD_TARGET_SERVERS', '')
# lemmy (for authenticated access to private instance)
LEMMY_INSTANCE = os.environ.get('LEMMY_INSTANCE', '')
LEMMY_USERNAME = os.environ.get('LEMMY_USERNAME', '')
LEMMY_PASSWORD = os.environ.get('LEMMY_PASSWORD', '')
# email (for sending intros)
SMTP_HOST = os.environ.get('SMTP_HOST', '')
SMTP_PORT = int(os.environ.get('SMTP_PORT', '465'))
SMTP_USER = os.environ.get('SMTP_USER', '')
SMTP_PASS = os.environ.get('SMTP_PASS', '')
# === HOST USER CONFIG ===
# the person running connectd - gets priority matching
HOST_USER = os.environ.get('HOST_USER', '') # alias like sudoxnym
HOST_NAME = os.environ.get('HOST_NAME', '')
HOST_EMAIL = os.environ.get('HOST_EMAIL', '')
HOST_GITHUB = os.environ.get('HOST_GITHUB', '')
HOST_MASTODON = os.environ.get('HOST_MASTODON', '') # user@instance
HOST_REDDIT = os.environ.get('HOST_REDDIT', '')
HOST_LEMMY = os.environ.get('HOST_LEMMY', '') # user@instance
HOST_LOBSTERS = os.environ.get('HOST_LOBSTERS', '')
HOST_MATRIX = os.environ.get('HOST_MATRIX', '') # @user:server
HOST_DISCORD = os.environ.get('HOST_DISCORD', '') # user id
HOST_BLUESKY = os.environ.get('HOST_BLUESKY', '') # handle.bsky.social
HOST_LOCATION = os.environ.get('HOST_LOCATION', '')
HOST_INTERESTS = os.environ.get('HOST_INTERESTS', '') # comma separated
HOST_LOOKING_FOR = os.environ.get('HOST_LOOKING_FOR', '')
def get_lost_config():
"""get lost builder configuration"""
return LOST_CONFIG.copy()
def update_lost_config(updates):
"""update lost builder configuration"""
global LOST_CONFIG
LOST_CONFIG.update(updates)
return LOST_CONFIG.copy()
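Both helpers return `LOST_CONFIG.copy()`, so callers get a snapshot they can mutate freely without touching the live settings, while `update_lost_config` changes the shared dict in place. A standalone re-implementation to illustrate those copy semantics (values abbreviated; this mirrors the two functions above but is not the module itself):

```python
# minimal sketch of config.py's copy semantics, with abbreviated values
LOST_CONFIG = {'max_per_day': 5, 'cooldown_days': 90}

def get_lost_config():
    # defensive copy: mutating the result does not affect the live config
    return LOST_CONFIG.copy()

def update_lost_config(updates):
    # dict.update mutates the shared dict in place
    LOST_CONFIG.update(updates)
    return LOST_CONFIG.copy()

snapshot = get_lost_config()
snapshot['max_per_day'] = 100  # only the caller's copy changes
```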


@ -1,72 +0,0 @@
name: connectd
version: "1.1.0"
slug: connectd
description: "find isolated builders with aligned values. auto-discover humans on github, mastodon, lemmy, discord, and more."
url: "https://github.com/sudoxnym/connectd"
arch:
- amd64
- aarch64
- armv7
startup: application
boot: auto
ports:
8099/tcp: 8099
ports_description:
8099/tcp: "connectd API (for HACS integration)"
map:
- config:rw
options:
host_user: ""
host_name: ""
host_email: ""
host_mastodon: ""
host_reddit: ""
host_lemmy: ""
host_lobsters: ""
host_matrix: ""
host_discord: ""
host_bluesky: ""
host_location: ""
host_interests: ""
host_looking_for: ""
github_token: ""
groq_api_key: ""
mastodon_token: ""
mastodon_instance: ""
discord_bot_token: ""
discord_target_servers: ""
lemmy_instance: ""
lemmy_username: ""
lemmy_password: ""
smtp_host: ""
smtp_port: 465
smtp_user: ""
smtp_pass: ""
schema:
host_user: str?
host_name: str?
host_email: email?
host_mastodon: str?
host_reddit: str?
host_lemmy: str?
host_lobsters: str?
host_matrix: str?
host_discord: str?
host_bluesky: str?
host_location: str?
host_interests: str?
host_looking_for: str?
github_token: str?
groq_api_key: str?
mastodon_token: str?
mastodon_instance: str?
discord_bot_token: str?
discord_target_servers: str?
lemmy_instance: str?
lemmy_username: str?
lemmy_password: str?
smtp_host: str?
smtp_port: int?
smtp_user: str?
smtp_pass: str?
image: sudoxreboot/connectd-addon-{arch}


@ -1,546 +0,0 @@
#!/usr/bin/env python3
"""
connectd daemon - continuous discovery and matchmaking
two modes of operation:
1. priority matching: find matches FOR hosts who run connectd
2. altruistic matching: connect strangers to each other
runs continuously, respects rate limits, sends intros automatically
"""
import time
import json
import signal
import sys
from datetime import datetime, timedelta
from pathlib import Path
from db import Database
from db.users import (init_users_table, get_priority_users, save_priority_match,
get_priority_user_matches, discover_host_user)
from scoutd import scrape_github, scrape_reddit, scrape_mastodon, scrape_lobsters, scrape_lemmy, scrape_discord
from config import HOST_USER
from scoutd.github import analyze_github_user, get_github_user
from scoutd.signals import analyze_text
from matchd.fingerprint import generate_fingerprint, fingerprint_similarity
from matchd.overlap import find_overlap
from matchd.lost import find_matches_for_lost_builders
from introd.draft import draft_intro, summarize_human, summarize_overlap
from introd.lost_intro import draft_lost_intro, get_lost_intro_config
from introd.send import send_email
from introd.deliver import deliver_intro, determine_best_contact
from config import get_lost_config
from api import start_api_thread, update_daemon_state
# daemon config
SCOUT_INTERVAL = 3600 * 4 # full scout every 4 hours
MATCH_INTERVAL = 3600 # check matches every hour
INTRO_INTERVAL = 3600 * 2 # send intros every 2 hours
LOST_INTERVAL = 3600 * 6 # lost builder outreach every 6 hours (lower volume)
MAX_INTROS_PER_DAY = 20 # rate limit outreach
MIN_OVERLAP_PRIORITY = 30 # min score for priority user matches
MIN_OVERLAP_STRANGERS = 50 # higher bar for stranger intros
class ConnectDaemon:
def __init__(self, dry_run=False):
self.db = Database()
init_users_table(self.db.conn)
self.running = True
self.dry_run = dry_run
self.started_at = datetime.now()
self.last_scout = None
self.last_match = None
self.last_intro = None
self.last_lost = None
self.intros_today = 0
self.lost_intros_today = 0
self.today = datetime.now().date()
# handle shutdown gracefully
signal.signal(signal.SIGINT, self._shutdown)
signal.signal(signal.SIGTERM, self._shutdown)
# auto-discover host user from env
if HOST_USER:
self.log(f"HOST_USER set: {HOST_USER}")
discover_host_user(self.db.conn, HOST_USER)
# update API state
self._update_api_state()
def _shutdown(self, signum, frame):
print("\nconnectd: shutting down...")
self.running = False
self._update_api_state()
def _update_api_state(self):
"""update API state for HA integration"""
now = datetime.now()
# calculate countdowns - if no cycle has run, use started_at
def secs_until(last, interval):
base = last if last else self.started_at
next_run = base + timedelta(seconds=interval)
remaining = (next_run - now).total_seconds()
return max(0, int(remaining))
update_daemon_state({
'running': self.running,
'dry_run': self.dry_run,
'last_scout': self.last_scout.isoformat() if self.last_scout else None,
'last_match': self.last_match.isoformat() if self.last_match else None,
'last_intro': self.last_intro.isoformat() if self.last_intro else None,
'last_lost': self.last_lost.isoformat() if self.last_lost else None,
'intros_today': self.intros_today,
'lost_intros_today': self.lost_intros_today,
'started_at': self.started_at.isoformat(),
'countdown_scout': secs_until(self.last_scout, SCOUT_INTERVAL),
'countdown_match': secs_until(self.last_match, MATCH_INTERVAL),
'countdown_intro': secs_until(self.last_intro, INTRO_INTERVAL),
'countdown_lost': secs_until(self.last_lost, LOST_INTERVAL),
})
def log(self, msg):
"""timestamped log"""
print(f"[{datetime.now().strftime('%Y-%m-%d %H:%M:%S')}] {msg}")
def reset_daily_limits(self):
"""reset daily intro count"""
if datetime.now().date() != self.today:
self.today = datetime.now().date()
self.intros_today = 0
self.lost_intros_today = 0
self.log("reset daily intro limits")
def scout_cycle(self):
"""run discovery on all platforms"""
self.log("starting scout cycle...")
try:
scrape_github(self.db, limit_per_source=30)
except Exception as e:
self.log(f"github scout error: {e}")
try:
scrape_reddit(self.db, limit_per_sub=30)
except Exception as e:
self.log(f"reddit scout error: {e}")
try:
scrape_mastodon(self.db, limit_per_instance=30)
except Exception as e:
self.log(f"mastodon scout error: {e}")
try:
scrape_lobsters(self.db)
except Exception as e:
self.log(f"lobsters scout error: {e}")
try:
scrape_lemmy(self.db, limit_per_community=30)
except Exception as e:
self.log(f"lemmy scout error: {e}")
try:
scrape_discord(self.db, limit_per_channel=50)
except Exception as e:
self.log(f"discord scout error: {e}")
self.last_scout = datetime.now()
stats = self.db.stats()
self.log(f"scout complete: {stats['total_humans']} humans in db")
def match_priority_users(self):
"""find matches for priority users (hosts)"""
priority_users = get_priority_users(self.db.conn)
if not priority_users:
return
self.log(f"matching for {len(priority_users)} priority users...")
humans = self.db.get_all_humans(min_score=20, limit=500)
for puser in priority_users:
# build priority user's fingerprint from their linked profiles
puser_signals = []
puser_text = []
if puser.get('bio'):
puser_text.append(puser['bio'])
if puser.get('interests'):
interests = json.loads(puser['interests']) if isinstance(puser['interests'], str) else puser['interests']
puser_signals.extend(interests)
if puser.get('looking_for'):
puser_text.append(puser['looking_for'])
# analyze their linked github if available
if puser.get('github'):
gh_user = analyze_github_user(puser['github'])
if gh_user:
puser_signals.extend(gh_user.get('signals', []))
puser_fingerprint = {
'values_vector': {},
'skills': {},
'interests': list(set(puser_signals)),
'location_pref': 'pnw' if puser.get('location') and 'seattle' in puser['location'].lower() else None,
}
# score text
if puser_text:
_, text_signals, _ = analyze_text(' '.join(puser_text))
puser_signals.extend(text_signals)
# find matches
matches_found = 0
for human in humans:
# skip if it's their own profile on another platform
human_user = human.get('username', '').lower()
if puser.get('github') and human_user == puser['github'].lower():
continue
if puser.get('reddit') and human_user == puser['reddit'].lower():
continue
if puser.get('mastodon') and human_user == puser['mastodon'].lower().split('@')[0]:
continue
# calculate overlap
human_signals = human.get('signals', [])
if isinstance(human_signals, str):
human_signals = json.loads(human_signals)
shared = set(puser_signals) & set(human_signals)
overlap_score = len(shared) * 10
# location bonus
if puser.get('location') and human.get('location'):
if 'seattle' in human['location'].lower() or 'pnw' in human['location'].lower():
overlap_score += 20
if overlap_score >= MIN_OVERLAP_PRIORITY:
overlap_data = {
'overlap_score': overlap_score,
'overlap_reasons': [f"shared: {', '.join(list(shared)[:5])}"] if shared else [],
}
save_priority_match(self.db.conn, puser['id'], human['id'], overlap_data)
matches_found += 1
if matches_found:
self.log(f" found {matches_found} matches for {puser['name'] or puser['email']}")
def match_strangers(self):
"""find matches between discovered humans (altruistic)"""
self.log("matching strangers...")
humans = self.db.get_all_humans(min_score=40, limit=200)
if len(humans) < 2:
return
# generate fingerprints
fingerprints = {}
for human in humans:
fp = generate_fingerprint(human)
fingerprints[human['id']] = fp
# find pairs
matches_found = 0
from itertools import combinations
for human_a, human_b in combinations(humans, 2):
# skip same platform same user
if human_a['platform'] == human_b['platform']:
if human_a['username'] == human_b['username']:
continue
fp_a = fingerprints.get(human_a['id'])
fp_b = fingerprints.get(human_b['id'])
overlap = find_overlap(human_a, human_b, fp_a, fp_b)
if overlap['overlap_score'] >= MIN_OVERLAP_STRANGERS:
# save match
self.db.save_match(human_a['id'], human_b['id'], overlap)
matches_found += 1
if matches_found:
self.log(f"found {matches_found} stranger matches")
self.last_match = datetime.now()
def send_stranger_intros(self):
"""send intros to connect strangers (or preview in dry-run mode)"""
self.reset_daily_limits()
if not self.dry_run and self.intros_today >= MAX_INTROS_PER_DAY:
self.log("daily intro limit reached")
return
# get unsent matches
c = self.db.conn.cursor()
c.execute('''SELECT m.*,
ha.id as a_id, ha.username as a_user, ha.platform as a_platform,
ha.name as a_name, ha.url as a_url, ha.contact as a_contact,
ha.signals as a_signals, ha.extra as a_extra,
hb.id as b_id, hb.username as b_user, hb.platform as b_platform,
hb.name as b_name, hb.url as b_url, hb.contact as b_contact,
hb.signals as b_signals, hb.extra as b_extra
FROM matches m
JOIN humans ha ON m.human_a_id = ha.id
JOIN humans hb ON m.human_b_id = hb.id
WHERE m.status = 'pending'
ORDER BY m.overlap_score DESC
LIMIT 10''')
matches = c.fetchall()
if self.dry_run:
self.log(f"DRY RUN: previewing {len(matches)} potential intros")
for match in matches:
if not self.dry_run and self.intros_today >= MAX_INTROS_PER_DAY:
break
match = dict(match)
# build human dicts
human_a = {
'id': match['a_id'],
'username': match['a_user'],
'platform': match['a_platform'],
'name': match['a_name'],
'url': match['a_url'],
'contact': match['a_contact'],
'signals': match['a_signals'],
'extra': match['a_extra'],
}
human_b = {
'id': match['b_id'],
'username': match['b_user'],
'platform': match['b_platform'],
'name': match['b_name'],
'url': match['b_url'],
'contact': match['b_contact'],
'signals': match['b_signals'],
'extra': match['b_extra'],
}
match_data = {
'id': match['id'],
'human_a': human_a,
'human_b': human_b,
'overlap_score': match['overlap_score'],
'overlap_reasons': match['overlap_reasons'],
}
# try to send intro to person with email
for recipient, other in [(human_a, human_b), (human_b, human_a)]:
contact = recipient.get('contact', {})
if isinstance(contact, str):
contact = json.loads(contact)
email = contact.get('email')
if not email:
continue
# draft intro
intro = draft_intro(match_data, recipient='a' if recipient == human_a else 'b')
# parse overlap reasons for display
reasons = match['overlap_reasons']
if isinstance(reasons, str):
reasons = json.loads(reasons)
reason_summary = ', '.join(reasons[:3]) if reasons else 'aligned values'
if self.dry_run:
# print preview
print("\n" + "=" * 60)
print(f"TO: {recipient['username']} ({recipient['platform']})")
print(f"EMAIL: {email}")
print(f"SUBJECT: you might want to meet {other['username']}")
print(f"SCORE: {match['overlap_score']:.0f} ({reason_summary})")
print("-" * 60)
print("MESSAGE:")
print(intro['draft'])
print("-" * 60)
print("[DRY RUN - NOT SENT]")
print("=" * 60)
break
else:
# actually send
success, error = send_email(
email,
f"connectd: you might want to meet {other['username']}",
intro['draft']
)
if success:
self.log(f"sent intro to {recipient['username']} ({email})")
self.intros_today += 1
# mark match as intro_sent
                        c.execute('UPDATE matches SET status = ? WHERE id = ?',
                                  ('intro_sent', match['id']))
                        self.db.conn.commit()
break
else:
self.log(f"failed to send to {email}: {error}")
self.last_intro = datetime.now()
def send_lost_builder_intros(self):
"""
reach out to lost builders - different tone, lower volume.
these people need encouragement, not networking.
"""
self.reset_daily_limits()
lost_config = get_lost_config()
if not lost_config.get('enabled', True):
return
max_per_day = lost_config.get('max_per_day', 5)
if not self.dry_run and self.lost_intros_today >= max_per_day:
self.log("daily lost builder intro limit reached")
return
# find lost builders with matching active builders
matches, error = find_matches_for_lost_builders(
self.db,
min_lost_score=lost_config.get('min_lost_score', 40),
min_values_score=lost_config.get('min_values_score', 20),
limit=max_per_day - self.lost_intros_today
)
if error:
self.log(f"lost builder matching error: {error}")
return
if not matches:
self.log("no lost builders ready for outreach")
return
if self.dry_run:
self.log(f"DRY RUN: previewing {len(matches)} lost builder intros")
for match in matches:
if not self.dry_run and self.lost_intros_today >= max_per_day:
break
lost = match['lost_user']
builder = match['inspiring_builder']
lost_name = lost.get('name') or lost.get('username')
builder_name = builder.get('name') or builder.get('username')
# draft intro
draft, draft_error = draft_lost_intro(lost, builder, lost_config)
if draft_error:
self.log(f"error drafting lost intro for {lost_name}: {draft_error}")
continue
# determine best contact method (activity-based)
method, contact_info = determine_best_contact(lost)
if self.dry_run:
print("\n" + "=" * 60)
print("LOST BUILDER OUTREACH")
print("=" * 60)
print(f"TO: {lost_name} ({lost.get('platform')})")
print(f"DELIVERY: {method}{contact_info}")
print(f"LOST SCORE: {lost.get('lost_potential_score', 0)}")
print(f"VALUES SCORE: {lost.get('score', 0)}")
print(f"INSPIRING BUILDER: {builder_name}")
print(f"SHARED INTERESTS: {', '.join(match.get('shared_interests', []))}")
print("-" * 60)
print("MESSAGE:")
print(draft)
print("-" * 60)
print("[DRY RUN - NOT SENT]")
print("=" * 60)
else:
# build match data for unified delivery
match_data = {
'human_a': builder, # inspiring builder
'human_b': lost, # lost builder (recipient)
'overlap_score': match.get('match_score', 0),
'overlap_reasons': match.get('shared_interests', []),
}
success, error, delivery_method = deliver_intro(match_data, draft)
if success:
self.log(f"sent lost builder intro to {lost_name} via {delivery_method}")
self.lost_intros_today += 1
self.db.mark_lost_outreach(lost['id'])
else:
self.log(f"failed to reach {lost_name} via {delivery_method}: {error}")
self.last_lost = datetime.now()
self.log(f"lost builder cycle complete: {self.lost_intros_today} sent today")
def run(self):
"""main daemon loop"""
self.log("connectd daemon starting...")
# start API server
start_api_thread()
self.log("api server started on port 8099")
if self.dry_run:
self.log("*** DRY RUN MODE - no intros will be sent ***")
self.log(f"scout interval: {SCOUT_INTERVAL}s")
self.log(f"match interval: {MATCH_INTERVAL}s")
self.log(f"intro interval: {INTRO_INTERVAL}s")
self.log(f"lost interval: {LOST_INTERVAL}s")
self.log(f"max intros/day: {MAX_INTROS_PER_DAY}")
# initial scout
self.scout_cycle()
self._update_api_state()
while self.running:
now = datetime.now()
# scout cycle
            if not self.last_scout or (now - self.last_scout).total_seconds() >= SCOUT_INTERVAL:
self.scout_cycle()
self._update_api_state()
# match cycle
            if not self.last_match or (now - self.last_match).total_seconds() >= MATCH_INTERVAL:
self.match_priority_users()
self.match_strangers()
self._update_api_state()
# intro cycle
            if not self.last_intro or (now - self.last_intro).total_seconds() >= INTRO_INTERVAL:
self.send_stranger_intros()
self._update_api_state()
# lost builder cycle
if not self.last_lost or (now - self.last_lost).total_seconds() >= LOST_INTERVAL:
self.send_lost_builder_intros()
self._update_api_state()
# sleep between checks
time.sleep(60)
self.log("connectd daemon stopped")
self.db.close()
def run_daemon(dry_run=False):
"""entry point"""
daemon = ConnectDaemon(dry_run=dry_run)
daemon.run()
if __name__ == '__main__':
import sys
dry_run = '--dry-run' in sys.argv
run_daemon(dry_run=dry_run)


@ -1,510 +0,0 @@
"""
priority users - people who host connectd get direct matching
"""
import sqlite3
import json
from datetime import datetime
from pathlib import Path
DB_PATH = Path(__file__).parent / 'connectd.db'
# map user-friendly interests to signal terms
INTEREST_TO_SIGNALS = {
'self-hosting': ['selfhosted', 'home_automation'],
'home-assistant': ['home_automation'],
'intentional-community': ['community', 'cooperative'],
'cooperatives': ['cooperative', 'community'],
'solarpunk': ['solarpunk'],
'privacy': ['privacy', 'local_first'],
'local-first': ['local_first', 'privacy'],
'queer-friendly': ['queer'],
'anti-capitalism': ['cooperative', 'decentralized', 'community'],
'esports-venue': [],
'foss': ['foss'],
'decentralized': ['decentralized'],
'federated': ['federated_chat'],
'mesh': ['mesh'],
}
def init_users_table(conn):
"""create priority users table"""
c = conn.cursor()
c.execute('''CREATE TABLE IF NOT EXISTS priority_users (
id INTEGER PRIMARY KEY,
name TEXT,
email TEXT UNIQUE,
github TEXT,
reddit TEXT,
mastodon TEXT,
lobsters TEXT,
matrix TEXT,
lemmy TEXT,
discord TEXT,
bluesky TEXT,
location TEXT,
bio TEXT,
interests TEXT,
looking_for TEXT,
created_at TEXT,
active INTEGER DEFAULT 1,
score REAL DEFAULT 0,
signals TEXT,
scraped_profile TEXT,
last_scored_at TEXT
)''')
# add missing columns to existing table
for col in ['lemmy', 'discord', 'bluesky']:
try:
c.execute(f'ALTER TABLE priority_users ADD COLUMN {col} TEXT')
except sqlite3.OperationalError:
pass # column already exists
# matches specifically for priority users
c.execute('''CREATE TABLE IF NOT EXISTS priority_matches (
id INTEGER PRIMARY KEY,
priority_user_id INTEGER,
matched_human_id INTEGER,
overlap_score REAL,
overlap_reasons TEXT,
status TEXT DEFAULT 'new',
notified_at TEXT,
viewed_at TEXT,
FOREIGN KEY(priority_user_id) REFERENCES priority_users(id),
FOREIGN KEY(matched_human_id) REFERENCES humans(id)
)''')
conn.commit()
def add_priority_user(conn, user_data):
"""add a priority user (someone hosting connectd)"""
c = conn.cursor()
c.execute('''INSERT OR REPLACE INTO priority_users
(name, email, github, reddit, mastodon, lobsters, matrix, lemmy, discord, bluesky,
location, bio, interests, looking_for, created_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)''',
(user_data.get('name'),
user_data.get('email'),
user_data.get('github'),
user_data.get('reddit'),
user_data.get('mastodon'),
user_data.get('lobsters'),
user_data.get('matrix'),
user_data.get('lemmy'),
user_data.get('discord'),
user_data.get('bluesky'),
user_data.get('location'),
user_data.get('bio'),
json.dumps(user_data.get('interests', [])),
user_data.get('looking_for'),
datetime.now().isoformat()))
conn.commit()
return c.lastrowid
def get_priority_users(conn):
"""get all active priority users"""
c = conn.cursor()
c.execute('SELECT * FROM priority_users WHERE active = 1')
return [dict(row) for row in c.fetchall()]
def get_priority_user(conn, user_id):
"""get a specific priority user"""
c = conn.cursor()
c.execute('SELECT * FROM priority_users WHERE id = ?', (user_id,))
row = c.fetchone()
return dict(row) if row else None
def save_priority_match(conn, priority_user_id, human_id, overlap_data):
"""save a match for a priority user"""
c = conn.cursor()
c.execute('''INSERT OR IGNORE INTO priority_matches
(priority_user_id, matched_human_id, overlap_score, overlap_reasons, status)
VALUES (?, ?, ?, ?, 'new')''',
(priority_user_id, human_id,
overlap_data.get('overlap_score', 0),
json.dumps(overlap_data.get('overlap_reasons', []))))
conn.commit()
return c.lastrowid
def get_priority_user_matches(conn, priority_user_id, status=None, limit=50):
"""get matches for a priority user"""
c = conn.cursor()
if status:
c.execute('''SELECT pm.*, h.* FROM priority_matches pm
JOIN humans h ON pm.matched_human_id = h.id
WHERE pm.priority_user_id = ? AND pm.status = ?
ORDER BY pm.overlap_score DESC
LIMIT ?''', (priority_user_id, status, limit))
else:
c.execute('''SELECT pm.*, h.* FROM priority_matches pm
JOIN humans h ON pm.matched_human_id = h.id
WHERE pm.priority_user_id = ?
ORDER BY pm.overlap_score DESC
LIMIT ?''', (priority_user_id, limit))
return [dict(row) for row in c.fetchall()]
def mark_match_viewed(conn, match_id):
"""mark a priority match as viewed"""
c = conn.cursor()
c.execute('''UPDATE priority_matches SET status = 'viewed', viewed_at = ?
WHERE id = ?''', (datetime.now().isoformat(), match_id))
conn.commit()
def expand_interests_to_signals(interests):
"""expand user-friendly interests to signal terms"""
signals = set()
for interest in interests:
interest_lower = interest.lower().strip()
if interest_lower in INTEREST_TO_SIGNALS:
signals.update(INTEREST_TO_SIGNALS[interest_lower])
else:
signals.add(interest_lower)
# always add these aligned signals for priority users
signals.update(['foss', 'decentralized', 'federated_chat', 'containers', 'unix', 'selfhosted'])
return list(signals)
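the expansion above can be exercised standalone; a minimal sketch using a subset of the INTEREST_TO_SIGNALS table and the baseline signals copied from this file (mapped interests expand, unknown ones pass through as-is):

```python
# subset of the INTEREST_TO_SIGNALS table defined above
INTEREST_TO_SIGNALS = {
    'self-hosting': ['selfhosted', 'home_automation'],
    'privacy': ['privacy', 'local_first'],
}

# baseline signals always added for priority users (copied from this file)
BASELINE = {'foss', 'decentralized', 'federated_chat', 'containers', 'unix', 'selfhosted'}

def expand_interests_to_signals(interests):
    signals = set()
    for interest in interests:
        key = interest.lower().strip()
        # mapped interests expand to their signal terms; unknown ones pass through
        signals.update(INTEREST_TO_SIGNALS.get(key, [key]))
    signals.update(BASELINE)
    return signals

result = expand_interests_to_signals(['Privacy', 'mesh'])
print(sorted(result))
```

note the pass-through branch: an interest with no mapping ('mesh' here) still becomes a signal, so scout-side signal names and user-stated interests can match directly.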
def score_priority_user(conn, user_id, scraped_profile=None):
"""
calculate a score for a priority user based on:
- their stated interests
- their scraped github profile (if available)
- their repos and activity
"""
c = conn.cursor()
c.execute('SELECT * FROM priority_users WHERE id = ?', (user_id,))
row = c.fetchone()
if not row:
return None
user = dict(row)
score = 0
signals = set()
# 1. score from stated interests
interests = user.get('interests')
if isinstance(interests, str):
interests = json.loads(interests) if interests else []
for interest in interests:
interest_lower = interest.lower()
# high-value interests
if 'solarpunk' in interest_lower:
score += 30
signals.add('solarpunk')
if 'queer' in interest_lower:
score += 30
signals.add('queer')
if 'cooperative' in interest_lower or 'intentional' in interest_lower:
score += 20
signals.add('cooperative')
if 'privacy' in interest_lower:
score += 10
signals.add('privacy')
if 'self-host' in interest_lower or 'selfhost' in interest_lower:
score += 15
signals.add('selfhosted')
if 'home-assistant' in interest_lower:
score += 15
signals.add('home_automation')
if 'foss' in interest_lower or 'open source' in interest_lower:
score += 10
signals.add('foss')
# 2. score from scraped profile
if scraped_profile:
# repos
repos = scraped_profile.get('top_repos', [])
if len(repos) >= 20:
score += 20
elif len(repos) >= 10:
score += 10
elif len(repos) >= 5:
score += 5
# languages
languages = scraped_profile.get('languages', {})
if 'Python' in languages or 'Rust' in languages:
score += 5
signals.add('modern_lang')
# topics from repos
topics = scraped_profile.get('topics', [])
for topic in topics:
if topic in ['self-hosted', 'home-assistant', 'privacy', 'foss']:
score += 10
signals.add(topic.replace('-', '_'))
# followers
followers = scraped_profile.get('followers', 0)
if followers >= 100:
score += 15
elif followers >= 50:
score += 10
elif followers >= 10:
score += 5
# 3. add expanded signals
expanded = expand_interests_to_signals(interests)
signals.update(expanded)
# update user
c.execute('''UPDATE priority_users
SET score = ?, signals = ?, scraped_profile = ?, last_scored_at = ?
WHERE id = ?''',
(score, json.dumps(list(signals)), json.dumps(scraped_profile) if scraped_profile else None,
datetime.now().isoformat(), user_id))
conn.commit()
return {'score': score, 'signals': list(signals)}
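the scraped-profile bonuses in score_priority_user are plain step functions; a standalone restatement of the follower tier, with thresholds copied from this file:

```python
def follower_bonus(followers):
    # tiers match score_priority_user: 100+ -> 15, 50+ -> 10, 10+ -> 5, else 0
    if followers >= 100:
        return 15
    if followers >= 50:
        return 10
    if followers >= 10:
        return 5
    return 0

for n in (3, 10, 50, 250):
    print(n, follower_bonus(n))
```

the repo-count tier works the same way (20+ repos -> 20, 10+ -> 10, 5+ -> 5), so both bonuses are bounded and cheap to recompute on every rescoring pass.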
def auto_match_priority_user(conn, user_id, min_overlap=40):
"""
automatically find and save matches for a priority user
uses relationship filtering to skip already-connected people
"""
from scoutd.deep import check_already_connected
c = conn.cursor()
# get user
c.execute('SELECT * FROM priority_users WHERE id = ?', (user_id,))
row = c.fetchone()
if not row:
return []
user = dict(row)
# get user signals
user_signals = set()
if user.get('signals'):
signals = json.loads(user['signals']) if isinstance(user['signals'], str) else user['signals']
user_signals.update(signals)
# also expand interests
if user.get('interests'):
interests = json.loads(user['interests']) if isinstance(user['interests'], str) else user['interests']
user_signals.update(expand_interests_to_signals(interests))
# clear old matches
c.execute('DELETE FROM priority_matches WHERE priority_user_id = ?', (user_id,))
conn.commit()
# get all humans
c.execute('SELECT * FROM humans WHERE score >= 25')
columns = [d[0] for d in c.description]
matches = []
for row in c.fetchall():
human = dict(zip(columns, row))
# skip own profiles
username = (human.get('username') or '').lower()
if user.get('github') and username == user['github'].lower():
continue
if user.get('reddit') and username == user.get('reddit', '').lower():
continue
# check if already connected
user_human = {'username': user.get('github'), 'platform': 'github', 'extra': {}}
connected, reason = check_already_connected(user_human, human)
if connected:
continue
# get human signals
human_signals = human.get('signals', [])
if isinstance(human_signals, str):
human_signals = json.loads(human_signals) if human_signals else []
# calculate overlap
shared = user_signals & set(human_signals)
overlap_score = len(shared) * 10
# high-value bonuses
if 'queer' in human_signals:
overlap_score += 40
shared.add('queer (rare!)')
if 'solarpunk' in human_signals:
overlap_score += 30
shared.add('solarpunk (rare!)')
if 'cooperative' in human_signals:
overlap_score += 20
shared.add('cooperative (values)')
# location bonus
location = (human.get('location') or '').lower()
user_location = (user.get('location') or '').lower()
if user_location and location:
if any(x in location for x in ['seattle', 'portland', 'pnw', 'washington', 'oregon']):
if 'seattle' in user_location or 'pnw' in user_location:
overlap_score += 25
shared.add('PNW location!')
if overlap_score >= min_overlap:
matches.append({
'human': human,
'overlap_score': overlap_score,
'shared': list(shared),
})
# sort and save top matches
matches.sort(key=lambda x: x['overlap_score'], reverse=True)
for m in matches[:50]: # save top 50
save_priority_match(conn, user_id, m['human']['id'], {
'overlap_score': m['overlap_score'],
'overlap_reasons': m['shared'],
})
return matches
def update_priority_user_profile(conn, user_id, profile_data):
"""update a priority user's profile with new data"""
c = conn.cursor()
updates = []
values = []
for field in ['name', 'email', 'github', 'reddit', 'mastodon', 'lobsters',
'matrix', 'lemmy', 'discord', 'bluesky', 'location', 'bio', 'looking_for']:
if field in profile_data and profile_data[field]:
updates.append(f'{field} = ?')
values.append(profile_data[field])
if 'interests' in profile_data:
updates.append('interests = ?')
values.append(json.dumps(profile_data['interests']))
if updates:
values.append(user_id)
c.execute(f'''UPDATE priority_users SET {', '.join(updates)} WHERE id = ?''', values)
conn.commit()
return True
def discover_host_user(conn, alias):
"""
auto-discover a host user by their alias (username).
scrapes github and discovers all connected social handles.
also merges in HOST_ env vars from config for manual overrides.
returns the priority user id
"""
from scoutd.github import analyze_github_user
from config import (HOST_NAME, HOST_EMAIL, HOST_GITHUB, HOST_MASTODON,
HOST_REDDIT, HOST_LEMMY, HOST_LOBSTERS, HOST_MATRIX,
HOST_DISCORD, HOST_BLUESKY, HOST_LOCATION, HOST_INTERESTS, HOST_LOOKING_FOR)
print(f"connectd: discovering host user '{alias}'...")
# scrape github for full profile
profile = analyze_github_user(alias)
if not profile:
print(f" could not find github user '{alias}'")
# still create from env vars if no github found
profile = {'name': HOST_NAME or alias, 'bio': '', 'location': HOST_LOCATION,
'contact': {}, 'extra': {'handles': {}}, 'topics': [], 'signals': []}
print(f" found: {profile.get('name')} ({alias})")
print(f" score: {profile.get('score', 0)}, signals: {len(profile.get('signals', []))}")
# extract contact info
contact = profile.get('contact', {})
handles = profile.get('extra', {}).get('handles', {})
# merge in HOST_ env vars (override discovered values)
if HOST_MASTODON:
handles['mastodon'] = HOST_MASTODON
if HOST_REDDIT:
handles['reddit'] = HOST_REDDIT
if HOST_LEMMY:
handles['lemmy'] = HOST_LEMMY
if HOST_LOBSTERS:
handles['lobsters'] = HOST_LOBSTERS
if HOST_MATRIX:
handles['matrix'] = HOST_MATRIX
if HOST_DISCORD:
handles['discord'] = HOST_DISCORD
if HOST_BLUESKY:
handles['bluesky'] = HOST_BLUESKY
# check if user already exists
c = conn.cursor()
c.execute('SELECT id FROM priority_users WHERE github = ?', (alias,))
existing = c.fetchone()
# parse HOST_INTERESTS if provided
interests = profile.get('topics', [])
if HOST_INTERESTS:
interests = [i.strip() for i in HOST_INTERESTS.split(',') if i.strip()]
user_data = {
'name': HOST_NAME or profile.get('name') or alias,
'email': HOST_EMAIL or contact.get('email'),
'github': HOST_GITHUB or alias,
'reddit': handles.get('reddit'),
'mastodon': handles.get('mastodon') or contact.get('mastodon'),
'lobsters': handles.get('lobsters'),
'matrix': handles.get('matrix') or contact.get('matrix'),
'lemmy': handles.get('lemmy') or contact.get('lemmy'),
'discord': handles.get('discord'),
'bluesky': handles.get('bluesky') or contact.get('bluesky'),
'location': HOST_LOCATION or profile.get('location'),
'bio': profile.get('bio'),
'interests': interests,
'looking_for': HOST_LOOKING_FOR,
}
if existing:
# update existing user
user_id = existing['id']
update_priority_user_profile(conn, user_id, user_data)
print(f" updated existing priority user (id={user_id})")
else:
# create new user
user_id = add_priority_user(conn, user_data)
print(f" created new priority user (id={user_id})")
# score the user
scraped_profile = {
'top_repos': profile.get('extra', {}).get('top_repos', []),
'languages': profile.get('languages', {}),
'topics': profile.get('topics', []),
'followers': profile.get('extra', {}).get('followers', 0),
}
score_result = score_priority_user(conn, user_id, scraped_profile)
print(f" scored: {score_result.get('score')}, {len(score_result.get('signals', []))} signals")
# print discovered handles
print("  discovered handles:")
for platform, handle in handles.items():
print(f" {platform}: {handle}")
return user_id
def get_host_user(conn):
"""get the host user (first priority user)"""
users = get_priority_users(conn)
return users[0] if users else None

Binary file not shown (image, 1.4 MiB)

@ -1,10 +0,0 @@
"""
introd - outreach module
drafts intros, queues for human review, sends via appropriate channel
"""
from .draft import draft_intro
from .review import get_pending_intros, approve_intro, reject_intro
from .send import send_intro
__all__ = ['draft_intro', 'get_pending_intros', 'approve_intro', 'reject_intro', 'send_intro']


@ -1,210 +0,0 @@
"""
introd/draft.py - AI writes intro messages referencing both parties' work
"""
import json
# intro template - transparent about being AI, neutral third party
INTRO_TEMPLATE = """hi {recipient_name},
i'm an AI that connects isolated builders working on similar things.
you're building: {recipient_summary}
{other_name} is building: {other_summary}
overlap: {overlap_summary}
thought you might benefit from knowing each other.
their work: {other_url}
no pitch. just connection. ignore if not useful.
- connectd
"""
# shorter version for platforms with character limits
SHORT_TEMPLATE = """hi {recipient_name} - i'm an AI connecting aligned builders.
you: {recipient_summary}
{other_name}: {other_summary}
overlap: {overlap_summary}
their work: {other_url}
no pitch, just connection.
"""
def summarize_human(human_data):
"""generate a brief summary of what someone is building/interested in"""
parts = []
# name or username
name = human_data.get('name') or human_data.get('username', 'unknown')
# platform context
platform = human_data.get('platform', '')
# signals/interests
signals = human_data.get('signals', [])
if isinstance(signals, str):
signals = json.loads(signals)
# extra data
extra = human_data.get('extra', {})
if isinstance(extra, str):
extra = json.loads(extra)
# build summary based on available data
topics = extra.get('topics', [])
languages = list(extra.get('languages', {}).keys())[:3]
repo_count = extra.get('repo_count', 0)
subreddits = extra.get('subreddits', [])
if platform == 'github':
if topics:
parts.append(f"working on {', '.join(topics[:3])}")
if languages:
parts.append(f"using {', '.join(languages)}")
if repo_count > 10:
parts.append(f"({repo_count} repos)")
elif platform == 'reddit':
if subreddits:
parts.append(f"active in r/{', r/'.join(subreddits[:3])}")
elif platform == 'mastodon':
instance = extra.get('instance', '')
if instance:
parts.append(f"on {instance}")
elif platform == 'lobsters':
karma = extra.get('karma', 0)
if karma > 50:
parts.append(f"active on lobste.rs ({karma} karma)")
# add key signals
key_signals = [s for s in signals if s in ['selfhosted', 'privacy', 'cooperative',
'solarpunk', 'intentional_community',
'home_automation', 'foss']]
if key_signals:
parts.append(f"interested in {', '.join(key_signals[:3])}")
if not parts:
parts.append(f"builder on {platform}")
return ' | '.join(parts)
def summarize_overlap(overlap_data):
"""generate overlap summary"""
reasons = overlap_data.get('overlap_reasons', [])
if isinstance(reasons, str):
reasons = json.loads(reasons)
if reasons:
return ' | '.join(reasons[:3])
# fallback
shared = overlap_data.get('shared_signals', [])
if shared:
return f"shared interests: {', '.join(shared[:3])}"
return "aligned values and interests"
def draft_intro(match_data, recipient='a'):
"""
draft an intro message for a match
match_data: dict with human_a, human_b, overlap info
recipient: 'a' or 'b' - who receives this intro
returns: dict with draft text, channel, metadata
"""
if recipient == 'a':
recipient_human = match_data['human_a']
other_human = match_data['human_b']
else:
recipient_human = match_data['human_b']
other_human = match_data['human_a']
# get names
recipient_name = recipient_human.get('name') or recipient_human.get('username', 'friend')
other_name = other_human.get('name') or other_human.get('username', 'someone')
# generate summaries
recipient_summary = summarize_human(recipient_human)
other_summary = summarize_human(other_human)
overlap_summary = summarize_overlap(match_data)
# other's url
other_url = other_human.get('url', '')
# determine best channel
contact = recipient_human.get('contact', {})
if isinstance(contact, str):
contact = json.loads(contact)
channel = None
channel_address = None
# prefer email if available
if contact.get('email'):
channel = 'email'
channel_address = contact['email']
# github issue/discussion
elif recipient_human.get('platform') == 'github':
channel = 'github'
channel_address = recipient_human.get('url')
# mastodon DM
elif recipient_human.get('platform') == 'mastodon':
channel = 'mastodon'
channel_address = recipient_human.get('username')
# reddit message
elif recipient_human.get('platform') == 'reddit':
channel = 'reddit'
channel_address = recipient_human.get('username')
else:
channel = 'manual'
channel_address = recipient_human.get('url')
# choose template based on channel
if channel in ['mastodon', 'reddit']:
template = SHORT_TEMPLATE
else:
template = INTRO_TEMPLATE
# render draft
draft = template.format(
recipient_name=recipient_name.split()[0] if recipient_name else 'friend', # first name only
recipient_summary=recipient_summary,
other_name=other_name.split()[0] if other_name else 'someone',
other_summary=other_summary,
overlap_summary=overlap_summary,
other_url=other_url,
)
return {
'recipient_human': recipient_human,
'other_human': other_human,
'channel': channel,
'channel_address': channel_address,
'draft': draft,
'overlap_score': match_data.get('overlap_score', 0),
'match_id': match_data.get('id'),
}
def draft_intros_for_match(match_data):
"""
draft intros for both parties in a match
returns list of two intro dicts
"""
intro_a = draft_intro(match_data, recipient='a')
intro_b = draft_intro(match_data, recipient='b')
return [intro_a, intro_b]
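both templates share the same six placeholders, and draft_intro must supply every one or str.format raises KeyError; a quick check of that contract with SHORT_TEMPLATE copied from this file (the names and url below are hypothetical sample values, not from the source):

```python
# SHORT_TEMPLATE copied verbatim from introd/draft.py
SHORT_TEMPLATE = """hi {recipient_name} - i'm an AI connecting aligned builders.
you: {recipient_summary}
{other_name}: {other_summary}
overlap: {overlap_summary}
their work: {other_url}
no pitch, just connection.
"""

# hypothetical sample values standing in for summarize_human/summarize_overlap output
fields = {
    'recipient_name': 'sam',
    'recipient_summary': 'working on mesh, solarpunk | using Python',
    'other_name': 'alex',
    'other_summary': 'active in r/selfhosted',
    'overlap_summary': 'shared interests: selfhosted, mesh',
    'other_url': 'https://github.com/alex',
}
message = SHORT_TEMPLATE.format(**fields)
print(message)
```

because the field set is identical across templates, draft_intro can build one dict and render whichever template the channel calls for.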


@ -1,250 +0,0 @@
"""
introd/lost_intro.py - intro drafting for lost builders
different tone than builder-to-builder intros.
these people need encouragement, not networking.
the goal isn't to recruit them. it's to show them the door exists.
they take it or they don't. but they'll know someone saw them.
"""
import os
import json
import requests
from datetime import datetime
GROQ_API_KEY = os.environ.get('GROQ_API_KEY', '')
GROQ_API_URL = 'https://api.groq.com/openai/v1/chat/completions'
MODEL = os.environ.get('GROQ_MODEL', 'llama-3.1-70b-versatile')
LOST_INTRO_TEMPLATE = """hey {name},
i'm connectd. i'm a daemon that finds people who might need a nudge.
i noticed you're interested in {interests}. you ask good questions. you clearly get it.
but maybe you haven't built anything yet. or you started and stopped. or you don't think you can.
that's okay. most people don't.
but some people do. here's one: {builder_name} ({builder_url})
{builder_description}
they started where you are. look at what they built.
you're not behind. you're just not started yet.
no pressure. just wanted you to know someone noticed.
- connectd"""
SYSTEM_PROMPT = """you are connectd, a daemon that finds isolated builders with aligned values and connects them.
right now you're reaching out to someone who has POTENTIAL but hasn't found it yet. maybe they gave up, maybe they're stuck, maybe they don't believe they can do it.
your job is to:
1. acknowledge where they are without being condescending
2. point them to an active builder who could inspire them
3. be genuine, not salesy or motivational-speaker-y
4. keep it short - these people are tired, don't overwhelm them
5. use lowercase, be human, no corporate bullshit
6. make it clear there's no pressure, no follow-up spam
you're not recruiting. you're not selling. you're just showing them a door.
the template structure:
- acknowledge them (you noticed something about them)
- normalize where they are (most people don't build things)
- show them someone who did (the builder)
- brief encouragement (you're not behind, just not started)
- sign off with no pressure
do NOT:
- be preachy or lecture them
- use motivational cliches ("you got this!", "believe in yourself!")
- make promises about outcomes
- be too long - they don't have energy for long messages
- make them feel bad about where they are"""
def draft_lost_intro(lost_user, inspiring_builder, config=None):
"""
draft an intro for a lost builder, pairing them with an inspiring active builder.
lost_user: the person who needs a nudge
inspiring_builder: an active builder with similar interests who could inspire them
"""
config = config or {}
# gather info about lost user
lost_name = lost_user.get('name') or lost_user.get('username', 'there')
lost_signals = lost_user.get('lost_signals', [])
lost_interests = extract_interests(lost_user)
# gather info about inspiring builder
builder_name = inspiring_builder.get('name') or inspiring_builder.get('username')
builder_url = inspiring_builder.get('url') or f"https://github.com/{inspiring_builder.get('username')}"
builder_description = create_builder_description(inspiring_builder)
# use LLM to personalize
if GROQ_API_KEY and config.get('use_llm', True):
return draft_with_llm(lost_user, inspiring_builder, lost_interests, builder_description)
# fallback to template
return LOST_INTRO_TEMPLATE.format(
name=lost_name,
interests=', '.join(lost_interests[:3]) if lost_interests else 'building things',
builder_name=builder_name,
builder_url=builder_url,
builder_description=builder_description,
), None
def extract_interests(user):
"""extract interests from user profile"""
interests = []
# from topics/tags
extra = user.get('extra', {})
if isinstance(extra, str):
try:
extra = json.loads(extra)
except (json.JSONDecodeError, TypeError):
extra = {}
topics = extra.get('topics', []) or extra.get('aligned_topics', [])
interests.extend(topics[:5])
# from subreddits
subreddits = user.get('subreddits', [])
for sub in subreddits[:3]:
if sub.lower() not in ['learnprogramming', 'findapath', 'getdisciplined']:
interests.append(sub)
# from bio keywords
bio = user.get('bio') or ''
bio_lower = bio.lower()
interest_keywords = [
'rust', 'python', 'javascript', 'go', 'linux', 'self-hosting', 'homelab',
'privacy', 'security', 'open source', 'foss', 'decentralized', 'ai', 'ml',
'web dev', 'backend', 'frontend', 'devops', 'data', 'automation',
]
for kw in interest_keywords:
if kw in bio_lower and kw not in interests:
interests.append(kw)
return interests[:5] if interests else ['technology', 'building things']
def create_builder_description(builder):
"""create a brief description of what the builder has done"""
extra = builder.get('extra', {})
if isinstance(extra, str):
try:
extra = json.loads(extra)
except (json.JSONDecodeError, TypeError):
extra = {}
parts = []
# what they build
repos = extra.get('top_repos', [])[:3]
if repos:
repo_names = [r.get('name') for r in repos if r.get('name')]
if repo_names:
parts.append(f"they've built things like {', '.join(repo_names[:2])}")
# their focus
topics = extra.get('aligned_topics', []) or extra.get('topics', [])
if topics:
parts.append(f"they work on {', '.join(topics[:3])}")
# their vibe
signals = builder.get('signals', [])
if 'self-hosted' in str(signals).lower():
parts.append("they're into self-hosting and owning their own infrastructure")
if 'privacy' in str(signals).lower():
parts.append("they care about privacy")
if 'community' in str(signals).lower():
parts.append("they're community-focused")
if parts:
return '. '.join(parts) + '.'
else:
return "they're building cool stuff in the open."
def draft_with_llm(lost_user, inspiring_builder, interests, builder_description):
"""use LLM to draft personalized intro"""
lost_name = lost_user.get('name') or lost_user.get('username', 'there')
lost_signals = lost_user.get('lost_signals', [])
lost_bio = lost_user.get('bio', '')
builder_name = inspiring_builder.get('name') or inspiring_builder.get('username')
builder_url = inspiring_builder.get('url') or f"https://github.com/{inspiring_builder.get('username')}"
user_prompt = f"""draft an intro for this lost builder:
LOST USER:
- name: {lost_name}
- interests: {', '.join(interests)}
- signals detected: {', '.join(lost_signals[:5]) if lost_signals else 'general stuck/aspiring patterns'}
- bio: {lost_bio[:200] if lost_bio else 'none'}
INSPIRING BUILDER TO SHOW THEM:
- name: {builder_name}
- url: {builder_url}
- what they do: {builder_description}
write a short, genuine message. no fluff. no motivational cliches. just human.
keep it under 150 words.
use lowercase.
end with "- connectd"
"""
try:
resp = requests.post(
GROQ_API_URL,
headers={
'Authorization': f'Bearer {GROQ_API_KEY}',
'Content-Type': 'application/json',
},
json={
'model': MODEL,
'messages': [
{'role': 'system', 'content': SYSTEM_PROMPT},
{'role': 'user', 'content': user_prompt},
],
'temperature': 0.7,
'max_tokens': 500,
},
timeout=30,
)
if resp.status_code == 200:
content = resp.json()['choices'][0]['message']['content']
return content.strip(), None
else:
return None, f"llm error: {resp.status_code}"
except Exception as e:
return None, str(e)
def get_lost_intro_config():
"""get configuration for lost builder outreach"""
return {
'enabled': True,
'max_per_day': 5, # lower volume, higher care
'require_review': True, # always manual approval
'cooldown_days': 90, # don't spam struggling people
'min_lost_score': 40,
'min_values_score': 20,
'use_llm': True,
}
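the `cooldown_days` knob above implies an eligibility check before re-contacting someone; a minimal sketch of that check (the `last_contacted_iso` parameter and its storage field are assumptions for illustration, not names from this file):

```python
from datetime import datetime, timedelta

def within_cooldown(last_contacted_iso, cooldown_days=90):
    # True if the last outreach is more recent than the cooldown window;
    # never-contacted users (empty/None) are always eligible
    if not last_contacted_iso:
        return False
    last = datetime.fromisoformat(last_contacted_iso)
    return datetime.now() - last < timedelta(days=cooldown_days)

recent = (datetime.now() - timedelta(days=10)).isoformat()
old = (datetime.now() - timedelta(days=120)).isoformat()
print(within_cooldown(recent), within_cooldown(old))
```

the 90-day default matches the config above: the point is not to spam struggling people, so the check errs toward skipping.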


@ -1,126 +0,0 @@
"""
introd/review.py - human approval queue before sending
"""
import json
from datetime import datetime
def get_pending_intros(db, limit=50):
"""
get all intros pending human review
returns list of intro dicts with full context
"""
rows = db.get_pending_intros(limit=limit)
intros = []
for row in rows:
# get associated match and humans
match_id = row.get('match_id')
recipient_id = row.get('recipient_human_id')
recipient = db.get_human_by_id(recipient_id) if recipient_id else None
intros.append({
'id': row['id'],
'match_id': match_id,
'recipient': recipient,
'channel': row.get('channel'),
'draft': row.get('draft'),
'status': row.get('status'),
})
return intros
def approve_intro(db, intro_id, approved_by='human'):
"""
approve an intro for sending
intro_id: database id of the intro
approved_by: who approved it (for audit trail)
"""
db.approve_intro(intro_id, approved_by)
print(f"introd: approved intro {intro_id} by {approved_by}")
def reject_intro(db, intro_id, reason=None):
"""
reject an intro (won't be sent)
"""
c = db.conn.cursor()
c.execute('''UPDATE intros SET status = 'rejected',
approved_at = ?, approved_by = ? WHERE id = ?''',
(datetime.now().isoformat(), f"rejected: {reason}" if reason else "rejected", intro_id))
db.conn.commit()
print(f"introd: rejected intro {intro_id}")
def review_intro_interactive(db, intro):
"""
interactive review of a single intro
returns: 'approve', 'reject', 'edit', or 'skip'
"""
print("\n" + "=" * 60)
print("INTRO FOR REVIEW")
print("=" * 60)
recipient = intro.get('recipient', {})
print(f"\nRecipient: {recipient.get('name') or recipient.get('username')}")
print(f"Platform: {recipient.get('platform')}")
print(f"Channel: {intro.get('channel')}")
print(f"\n--- DRAFT ---")
print(intro.get('draft'))
print("--- END ---\n")
while True:
choice = input("[a]pprove / [r]eject / [s]kip / [e]dit? ").strip().lower()
if choice in ['a', 'approve']:
approve_intro(db, intro['id'])
return 'approve'
elif choice in ['r', 'reject']:
reason = input("reason (optional): ").strip()
reject_intro(db, intro['id'], reason)
return 'reject'
elif choice in ['s', 'skip']:
return 'skip'
elif choice in ['e', 'edit']:
print("editing not yet implemented - approve or reject")
else:
print("invalid choice")
def review_all_pending(db):
"""
interactive review of all pending intros
"""
intros = get_pending_intros(db)
if not intros:
print("no pending intros to review")
return
print(f"\n{len(intros)} intros pending review\n")
approved = 0
rejected = 0
skipped = 0
for intro in intros:
result = review_intro_interactive(db, intro)
if result == 'approve':
approved += 1
elif result == 'reject':
rejected += 1
else:
skipped += 1
cont = input("\ncontinue reviewing? [y/n] ").strip().lower()
if cont != 'y':
break
print(f"\nreview complete: {approved} approved, {rejected} rejected, {skipped} skipped")


@ -1,216 +0,0 @@
"""
introd/send.py - actually deliver intros via appropriate channel
"""
import smtplib
import requests
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
from datetime import datetime
import os
# email config (from env)
SMTP_HOST = os.environ.get('SMTP_HOST', '')
SMTP_PORT = int(os.environ.get('SMTP_PORT', '465'))
SMTP_USER = os.environ.get('SMTP_USER', '')
SMTP_PASS = os.environ.get('SMTP_PASS', '')
FROM_EMAIL = os.environ.get('FROM_EMAIL', '')
def send_email(to_email, subject, body):
"""send email via SMTP"""
msg = MIMEMultipart()
msg['From'] = FROM_EMAIL
msg['To'] = to_email
msg['Subject'] = subject
msg.attach(MIMEText(body, 'plain'))
try:
with smtplib.SMTP_SSL(SMTP_HOST, SMTP_PORT) as server:
server.login(SMTP_USER, SMTP_PASS)
server.send_message(msg)
return True, None
except Exception as e:
return False, str(e)
def send_github_issue(repo_url, title, body):
"""
create a github issue (requires GITHUB_TOKEN)
note: only works if you have write access to the repo
typically won't work for random users - fallback to manual
"""
# extract owner/repo from url
# https://github.com/owner/repo -> owner/repo
parts = repo_url.rstrip('/').split('/')
if len(parts) < 2:
return False, "invalid github url"
owner = parts[-2]
repo = parts[-1]
token = os.environ.get('GITHUB_TOKEN')
if not token:
return False, "no github token"
# would create issue via API - but this is invasive
# better to just output the info for manual action
return False, "github issues not automated - use manual outreach"
def send_mastodon_dm(instance, username, message):
"""
send mastodon DM (requires account credentials)
not implemented - requires oauth setup
"""
return False, "mastodon DMs not automated - use manual outreach"
def send_reddit_message(username, subject, body):
"""
send reddit message (requires account credentials)
not implemented - requires oauth setup
"""
return False, "reddit messages not automated - use manual outreach"
def send_intro(db, intro_id):
"""
send an approved intro
returns: (success, error_message)
"""
# get intro from db
c = db.conn.cursor()
c.execute('SELECT * FROM intros WHERE id = ?', (intro_id,))
row = c.fetchone()
if not row:
return False, "intro not found"
intro = dict(row)
if intro['status'] != 'approved':
return False, f"intro not approved (status: {intro['status']})"
channel = intro.get('channel')
draft = intro.get('draft')
# get recipient info
recipient = db.get_human_by_id(intro['recipient_human_id'])
if not recipient:
return False, "recipient not found"
success = False
error = None
if channel == 'email':
# get email from contact
import json
contact = recipient.get('contact', {})
if isinstance(contact, str):
contact = json.loads(contact)
email = contact.get('email')
if email:
success, error = send_email(
email,
"connection: aligned builder intro",
draft
)
else:
error = "no email address"
elif channel == 'github':
success, error = send_github_issue(
recipient.get('url'),
"connection: aligned builder intro",
draft
)
elif channel == 'mastodon':
success, error = send_mastodon_dm(
recipient.get('instance'),
recipient.get('username'),
draft
)
elif channel == 'reddit':
success, error = send_reddit_message(
recipient.get('username'),
"connection: aligned builder intro",
draft
)
else:
error = f"unknown channel: {channel}"
# update status
if success:
db.mark_intro_sent(intro_id)
print(f"introd: sent intro {intro_id} via {channel}")
else:
# mark as needs manual sending
c.execute('''UPDATE intros SET status = 'manual_needed',
approved_at = ? WHERE id = ?''',
(datetime.now().isoformat(), intro_id))
db.conn.commit()
print(f"introd: intro {intro_id} needs manual send ({error})")
return success, error
def send_all_approved(db):
"""
send all approved intros
"""
c = db.conn.cursor()
c.execute("SELECT id FROM intros WHERE status = 'approved'")
rows = c.fetchall()
if not rows:
print("no approved intros to send")
return
print(f"sending {len(rows)} approved intros...")
sent = 0
failed = 0
for row in rows:
success, error = send_intro(db, row['id'])
if success:
sent += 1
else:
failed += 1
print(f"sent: {sent}, failed/manual: {failed}")
def export_manual_intros(db, output_file='manual_intros.txt'):
"""
export intros that need manual sending to a text file
"""
c = db.conn.cursor()
c.execute('''SELECT i.*, h.username, h.platform, h.url
FROM intros i
JOIN humans h ON i.recipient_human_id = h.id
WHERE i.status IN ('approved', 'manual_needed')''')
rows = c.fetchall()
if not rows:
print("no intros to export")
return
with open(output_file, 'w') as f:
for row in rows:
f.write("=" * 60 + "\n")
f.write(f"TO: {row['username']} ({row['platform']})\n")
f.write(f"URL: {row['url']}\n")
f.write(f"CHANNEL: {row['channel']}\n")
f.write("-" * 60 + "\n")
f.write(row['draft'] + "\n")
f.write("\n")
print(f"exported {len(rows)} intros to {output_file}")

Binary file not shown (image, 1.4 MiB, deleted).

@@ -1,10 +0,0 @@
"""
matchd - pairing module
generates fingerprints, finds overlaps, ranks matches
"""
from .fingerprint import generate_fingerprint
from .overlap import find_overlap
from .rank import rank_matches, find_all_matches
__all__ = ['generate_fingerprint', 'find_overlap', 'rank_matches', 'find_all_matches']

@@ -1,210 +0,0 @@
"""
matchd/fingerprint.py - generate values profiles for humans
"""
import json
from collections import defaultdict
# values dimensions we track
VALUES_DIMENSIONS = [
'privacy', # surveillance concern, degoogle, self-hosted
'decentralization', # p2p, fediverse, local-first
'cooperation', # coops, mutual aid, community
'queer_friendly', # lgbtq+, pronouns
'environmental', # solarpunk, degrowth, sustainability
'anticapitalist', # post-capitalism, worker ownership
'builder', # creates vs consumes
'pnw_oriented', # pacific northwest connection
]
# skill categories
SKILL_CATEGORIES = [
'backend', # python, go, rust, databases
'frontend', # js, react, css
'devops', # docker, k8s, linux admin
'hardware', # electronics, embedded, iot
'design', # ui/ux, graphics
'community', # organizing, facilitation
'writing', # documentation, content
]
# signal to dimension mapping
SIGNAL_TO_DIMENSION = {
'privacy': 'privacy',
'selfhosted': 'privacy',
'degoogle': 'privacy',
'decentralized': 'decentralization',
'local_first': 'decentralization',
'p2p': 'decentralization',
'federated_chat': 'decentralization',
'foss': 'decentralization',
'cooperative': 'cooperation',
'community': 'cooperation',
'mutual_aid': 'cooperation',
'intentional_community': 'cooperation',
'queer': 'queer_friendly',
'pronouns': 'queer_friendly',
'blm': 'queer_friendly',
'acab': 'queer_friendly',
'solarpunk': 'environmental',
'anticapitalist': 'anticapitalist',
'pnw': 'pnw_oriented',
'pnw_state': 'pnw_oriented',
'remote': 'pnw_oriented',
'home_automation': 'builder',
'modern_lang': 'builder',
'unix': 'builder',
'containers': 'builder',
}
# language to skill mapping
LANGUAGE_TO_SKILL = {
'python': 'backend',
'go': 'backend',
'rust': 'backend',
'java': 'backend',
'ruby': 'backend',
'php': 'backend',
'javascript': 'frontend',
'typescript': 'frontend',
'html': 'frontend',
'css': 'frontend',
'vue': 'frontend',
'shell': 'devops',
'dockerfile': 'devops',
'nix': 'devops',
'hcl': 'devops',
'c': 'hardware',
'c++': 'hardware',
'arduino': 'hardware',
'verilog': 'hardware',
}
def generate_fingerprint(human_data):
"""
generate a values fingerprint for a human
input: human dict from database (has signals, languages, etc)
output: fingerprint dict with values_vector, skills, interests
"""
# parse stored json fields
signals = human_data.get('signals', [])
if isinstance(signals, str):
signals = json.loads(signals)
extra = human_data.get('extra', {})
if isinstance(extra, str):
extra = json.loads(extra)
languages = extra.get('languages', {})
topics = extra.get('topics', [])
# build values vector
values_vector = defaultdict(float)
# from signals
for signal in signals:
dimension = SIGNAL_TO_DIMENSION.get(signal)
if dimension:
values_vector[dimension] += 1.0
# normalize values vector (0-1 scale)
max_val = max(values_vector.values()) if values_vector else 1
values_vector = {k: min(v / max_val, 1.0) for k, v in values_vector.items()}
# fill in missing dimensions with 0
for dim in VALUES_DIMENSIONS:
if dim not in values_vector:
values_vector[dim] = 0.0
# determine skills from languages
skills = defaultdict(float)
total_repos = sum(languages.values()) if languages else 1
for lang, count in languages.items():
skill = LANGUAGE_TO_SKILL.get(lang.lower())
if skill:
skills[skill] += count / total_repos
# normalize skills
if skills:
max_skill = max(skills.values())
skills = {k: min(v / max_skill, 1.0) for k, v in skills.items()}
# interests from topics and signals
interests = list(set(topics + signals))
# location preference
location_pref = None
if 'pnw' in signals or 'pnw_state' in signals:
location_pref = 'pnw'
elif 'remote' in signals:
location_pref = 'remote'
elif human_data.get('location'):
loc = human_data['location'].lower()
if any(x in loc for x in ['seattle', 'portland', 'washington', 'oregon', 'pnw', 'cascadia']):
location_pref = 'pnw'
# availability (based on hireable flag if present)
availability = None
if extra.get('hireable'):
availability = 'open'
return {
'human_id': human_data.get('id'),
'values_vector': dict(values_vector),
'skills': dict(skills),
'interests': interests,
'location_pref': location_pref,
'availability': availability,
}
def fingerprint_similarity(fp_a, fp_b):
"""
calculate similarity between two fingerprints
returns 0-1 score
"""
# values similarity (cosine-ish)
va = fp_a.get('values_vector', {})
vb = fp_b.get('values_vector', {})
all_dims = set(va.keys()) | set(vb.keys())
if not all_dims:
return 0.0
dot_product = sum(va.get(d, 0) * vb.get(d, 0) for d in all_dims)
mag_a = sum(v**2 for v in va.values()) ** 0.5
mag_b = sum(v**2 for v in vb.values()) ** 0.5
if mag_a == 0 or mag_b == 0:
values_sim = 0.0
else:
values_sim = dot_product / (mag_a * mag_b)
# interest overlap (jaccard)
ia = set(fp_a.get('interests', []))
ib = set(fp_b.get('interests', []))
if ia or ib:
interest_sim = len(ia & ib) / len(ia | ib)
else:
interest_sim = 0.0
# location compatibility
loc_a = fp_a.get('location_pref')
loc_b = fp_b.get('location_pref')
loc_sim = 0.0
if loc_a == loc_b and loc_a is not None:
loc_sim = 1.0
elif loc_a == 'remote' or loc_b == 'remote':
loc_sim = 0.5
elif loc_a == 'pnw' or loc_b == 'pnw':
loc_sim = 0.3
# weighted combination
similarity = (values_sim * 0.5) + (interest_sim * 0.3) + (loc_sim * 0.2)
return similarity
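a worked example of the weighting above (cosine on values × 0.5, jaccard on interests × 0.3, location × 0.2). since the deleted module can't be imported, `similarity` below restates fingerprint_similarity's arithmetic standalone; the two toy fingerprints are illustrative:

```python
def similarity(fp_a, fp_b):
    """same weighting as fingerprint_similarity: cosine*0.5 + jaccard*0.3 + location*0.2"""
    va, vb = fp_a['values_vector'], fp_b['values_vector']
    dims = set(va) | set(vb)
    dot = sum(va.get(d, 0) * vb.get(d, 0) for d in dims)
    mag_a = sum(v ** 2 for v in va.values()) ** 0.5
    mag_b = sum(v ** 2 for v in vb.values()) ** 0.5
    values_sim = dot / (mag_a * mag_b) if mag_a and mag_b else 0.0
    ia, ib = set(fp_a['interests']), set(fp_b['interests'])
    interest_sim = len(ia & ib) / len(ia | ib) if (ia or ib) else 0.0
    loc_a, loc_b = fp_a['location_pref'], fp_b['location_pref']
    if loc_a == loc_b and loc_a is not None:
        loc_sim = 1.0
    elif loc_a == 'remote' or loc_b == 'remote':
        loc_sim = 0.5
    elif loc_a == 'pnw' or loc_b == 'pnw':
        loc_sim = 0.3
    else:
        loc_sim = 0.0
    return values_sim * 0.5 + interest_sim * 0.3 + loc_sim * 0.2

a = {'values_vector': {'privacy': 1.0, 'decentralization': 0.5},
     'interests': ['selfhosted', 'foss'], 'location_pref': 'pnw'}
b = {'values_vector': {'privacy': 1.0, 'decentralization': 1.0},
     'interests': ['foss', 'solarpunk'], 'location_pref': 'pnw'}
print(round(similarity(a, b), 3))  # 0.774: strong values overlap, one shared interest, both pnw
```

note the location term dominates ties: two pnw profiles start with a guaranteed 0.2.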

@@ -1,199 +0,0 @@
"""
matchd/lost.py - lost builder matching
lost builders don't get matched to each other (both need energy).
they get matched to ACTIVE builders who can inspire them.
the goal: show them someone like them who made it.
"""
import json
from .overlap import find_overlap, is_same_person
def find_inspiring_builder(lost_user, active_builders, db=None):
"""
find an active builder who could inspire a lost builder.
criteria:
- shared interests (they need to relate to this person)
- active builder has shipped real work (proof it's possible)
- similar background signals if possible
- NOT the same person across platforms
"""
if not active_builders:
return None, "no active builders available"
# parse lost user data
lost_signals = lost_user.get('signals', [])
if isinstance(lost_signals, str):
lost_signals = json.loads(lost_signals) if lost_signals else []
lost_extra = lost_user.get('extra', {})
if isinstance(lost_extra, str):
lost_extra = json.loads(lost_extra) if lost_extra else {}
# lost user interests
lost_interests = set()
lost_interests.update(lost_signals)
lost_interests.update(lost_extra.get('topics', []))
lost_interests.update(lost_extra.get('aligned_topics', []))
# also include subreddits if from reddit (shows interests)
subreddits = lost_user.get('subreddits', [])
if isinstance(subreddits, str):
subreddits = json.loads(subreddits) if subreddits else []
lost_interests.update(subreddits)
# score each active builder
candidates = []
for builder in active_builders:
# skip if same person (cross-platform)
if is_same_person(lost_user, builder):
continue
# get builder signals
builder_signals = builder.get('signals', [])
if isinstance(builder_signals, str):
builder_signals = json.loads(builder_signals) if builder_signals else []
builder_extra = builder.get('extra', {})
if isinstance(builder_extra, str):
builder_extra = json.loads(builder_extra) if builder_extra else {}
# builder interests
builder_interests = set()
builder_interests.update(builder_signals)
builder_interests.update(builder_extra.get('topics', []))
builder_interests.update(builder_extra.get('aligned_topics', []))
# calculate match score
shared_interests = lost_interests & builder_interests
match_score = len(shared_interests) * 10
# bonus for high-value shared signals
high_value_signals = ['privacy', 'selfhosted', 'home_automation', 'foss',
'solarpunk', 'cooperative', 'decentralized', 'queer']
for signal in shared_interests:
if signal in high_value_signals:
match_score += 15
# bonus if builder has shipped real work (proof it's possible)
repos = builder_extra.get('top_repos', [])
if len(repos) >= 5:
match_score += 20 # they've built things
elif len(repos) >= 2:
match_score += 10
# bonus for high stars (visible success)
total_stars = sum(r.get('stars', 0) for r in repos) if repos else 0
if total_stars >= 100:
match_score += 15
elif total_stars >= 20:
match_score += 5
# bonus for similar location (relatable)
lost_loc = (lost_user.get('location') or '').lower()
builder_loc = (builder.get('location') or '').lower()
if lost_loc and builder_loc:
pnw_keywords = ['seattle', 'portland', 'washington', 'oregon', 'pnw']
if any(k in lost_loc for k in pnw_keywords) and any(k in builder_loc for k in pnw_keywords):
match_score += 10
# minimum threshold - need SOMETHING in common
if match_score < 10:
continue
candidates.append({
'builder': builder,
'match_score': match_score,
'shared_interests': list(shared_interests)[:5],
'repos_count': len(repos),
'total_stars': total_stars,
})
if not candidates:
return None, "no matching active builders found"
# sort by match score, return best
candidates.sort(key=lambda x: x['match_score'], reverse=True)
best = candidates[0]
return best, None
def find_matches_for_lost_builders(db, min_lost_score=40, min_values_score=20, limit=10):
"""
find inspiring builder matches for all lost builders ready for outreach.
returns list of (lost_user, inspiring_builder, match_data)
"""
# get lost builders ready for outreach
lost_builders = db.get_lost_builders_for_outreach(
min_lost_score=min_lost_score,
min_values_score=min_values_score,
limit=limit
)
if not lost_builders:
return [], "no lost builders ready for outreach"
# get active builders who can inspire
active_builders = db.get_active_builders(min_score=50, limit=200)
if not active_builders:
return [], "no active builders available"
matches = []
for lost_user in lost_builders:
best_match, error = find_inspiring_builder(lost_user, active_builders, db)
if best_match:
matches.append({
'lost_user': lost_user,
'inspiring_builder': best_match['builder'],
'match_score': best_match['match_score'],
'shared_interests': best_match['shared_interests'],
'builder_repos': best_match['repos_count'],
'builder_stars': best_match['total_stars'],
})
return matches, None
def get_lost_match_summary(match_data):
"""
get a human-readable summary of a lost builder match.
"""
lost = match_data['lost_user']
builder = match_data['inspiring_builder']
lost_name = lost.get('name') or lost.get('username', 'someone')
builder_name = builder.get('name') or builder.get('username', 'a builder')
shared = match_data.get('shared_interests', [])
summary = f"""
lost builder: {lost_name} ({lost.get('platform')})
lost score: {lost.get('lost_potential_score', 0)}
values score: {lost.get('score', 0)}
url: {lost.get('url')}
inspiring builder: {builder_name} ({builder.get('platform')})
score: {builder.get('score', 0)}
repos: {match_data.get('builder_repos', 0)}
stars: {match_data.get('builder_stars', 0)}
url: {builder.get('url')}
match score: {match_data.get('match_score', 0)}
shared interests: {', '.join(shared) if shared else 'values alignment'}
this lost builder needs to see that someone like them made it.
"""
return summary.strip()
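the bonus schedule in find_inspiring_builder is easiest to see with numbers. `inspire_score` below restates that schedule standalone (it is not part of the module); the inputs are toy values:

```python
HIGH_VALUE = {'privacy', 'selfhosted', 'home_automation', 'foss',
              'solarpunk', 'cooperative', 'decentralized', 'queer'}
PNW = ('seattle', 'portland', 'washington', 'oregon', 'pnw')

def inspire_score(shared_interests, repo_count, total_stars, lost_loc, builder_loc):
    """same bonus schedule as find_inspiring_builder, restated standalone"""
    score = len(shared_interests) * 10
    score += sum(15 for s in shared_interests if s in HIGH_VALUE)
    if repo_count >= 5:
        score += 20   # they've shipped real work
    elif repo_count >= 2:
        score += 10
    if total_stars >= 100:
        score += 15   # visible success
    elif total_stars >= 20:
        score += 5
    if any(k in lost_loc for k in PNW) and any(k in builder_loc for k in PNW):
        score += 10   # relatable location
    return score

# 2 shared interests (20) + selfhosted is high-value (15) + 6 repos (20)
# + 150 stars (15) + both pnw (10) = 80, well above the minimum threshold of 10
print(inspire_score({'selfhosted', 'homelab'}, 6, 150, 'seattle, wa', 'portland, or'))  # 80
```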

@@ -1,150 +0,0 @@
"""
matchd/overlap.py - find pairs with alignment
"""
import json
from .fingerprint import fingerprint_similarity
def find_overlap(human_a, human_b, fp_a=None, fp_b=None):
"""
analyze overlap between two humans
returns overlap details: score, shared values, complementary skills
"""
# parse stored json if needed
signals_a = human_a.get('signals', [])
if isinstance(signals_a, str):
signals_a = json.loads(signals_a)
signals_b = human_b.get('signals', [])
if isinstance(signals_b, str):
signals_b = json.loads(signals_b)
extra_a = human_a.get('extra', {})
if isinstance(extra_a, str):
extra_a = json.loads(extra_a)
extra_b = human_b.get('extra', {})
if isinstance(extra_b, str):
extra_b = json.loads(extra_b)
# shared signals
shared_signals = list(set(signals_a) & set(signals_b))
# shared topics
topics_a = set(extra_a.get('topics', []))
topics_b = set(extra_b.get('topics', []))
shared_topics = list(topics_a & topics_b)
# complementary skills (what one has that the other doesn't)
langs_a = set(extra_a.get('languages', {}).keys())
langs_b = set(extra_b.get('languages', {}).keys())
complementary_langs = list((langs_a - langs_b) | (langs_b - langs_a))
# geographic compatibility
loc_a = human_a.get('location', '').lower() if human_a.get('location') else ''
loc_b = human_b.get('location', '').lower() if human_b.get('location') else ''
pnw_keywords = ['seattle', 'portland', 'washington', 'oregon', 'pnw', 'cascadia', 'pacific northwest']
remote_keywords = ['remote', 'anywhere', 'distributed']
a_pnw = any(k in loc_a for k in pnw_keywords) or 'pnw' in signals_a
b_pnw = any(k in loc_b for k in pnw_keywords) or 'pnw' in signals_b
a_remote = any(k in loc_a for k in remote_keywords) or 'remote' in signals_a
b_remote = any(k in loc_b for k in remote_keywords) or 'remote' in signals_b
geographic_match = False
geo_reason = None
if a_pnw and b_pnw:
geographic_match = True
geo_reason = 'both in pnw'
elif (a_pnw or b_pnw) and (a_remote or b_remote):
geographic_match = True
geo_reason = 'pnw + remote compatible'
elif a_remote and b_remote:
geographic_match = True
geo_reason = 'both remote-friendly'
# calculate overlap score
base_score = 0
# shared values (most important)
base_score += len(shared_signals) * 10
# shared interests
base_score += len(shared_topics) * 5
# complementary skills bonus (they can help each other)
if complementary_langs:
base_score += min(len(complementary_langs), 5) * 3
# geographic bonus
if geographic_match:
base_score += 20
# fingerprint similarity if available
fp_score = 0
if fp_a and fp_b:
fp_score = fingerprint_similarity(fp_a, fp_b) * 50
total_score = base_score + fp_score
# build reasons
overlap_reasons = []
if shared_signals:
overlap_reasons.append(f"shared values: {', '.join(shared_signals[:5])}")
if shared_topics:
overlap_reasons.append(f"shared interests: {', '.join(shared_topics[:5])}")
if geo_reason:
overlap_reasons.append(geo_reason)
if complementary_langs:
overlap_reasons.append(f"complementary skills: {', '.join(complementary_langs[:5])}")
return {
'overlap_score': total_score,
'shared_signals': shared_signals,
'shared_topics': shared_topics,
'complementary_skills': complementary_langs,
'geographic_match': geographic_match,
'geo_reason': geo_reason,
'overlap_reasons': overlap_reasons,
'fingerprint_similarity': fp_score / 50 if fp_a and fp_b else None,
}
def is_same_person(human_a, human_b):
"""
check if two records might be the same person (cross-platform)
"""
# same platform = definitely different records
if human_a['platform'] == human_b['platform']:
return False
# check username similarity (strip leading @ so fediverse handles like @user@instance compare cleanly)
user_a = human_a.get('username', '').lower().lstrip('@').split('@')[0]
user_b = human_b.get('username', '').lower().lstrip('@').split('@')[0]
if user_a and user_a == user_b:
return True
# check if github username matches
contact_a = human_a.get('contact', {})
contact_b = human_b.get('contact', {})
if isinstance(contact_a, str):
contact_a = json.loads(contact_a)
if isinstance(contact_b, str):
contact_b = json.loads(contact_b)
# github cross-reference
if contact_a.get('github') and contact_a.get('github') == contact_b.get('github'):
return True
if contact_a.get('github') == user_b or contact_b.get('github') == user_a:
return True
# email cross-reference
if contact_a.get('email') and contact_a.get('email') == contact_b.get('email'):
return True
return False
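how the cross-platform dedup plays out in practice: `same_person` below restates is_same_person's checks on plain dicts (with a leading-@ strip and an empty-username guard added for fediverse handles — both small assumptions, not in the original):

```python
import json

def same_person(a, b):
    """restatement of is_same_person's checks on plain dicts"""
    if a['platform'] == b['platform']:
        return False  # same platform means two distinct accounts
    ua = a.get('username', '').lower().lstrip('@').split('@')[0]
    ub = b.get('username', '').lower().lstrip('@').split('@')[0]
    if ua and ua == ub:
        return True
    ca, cb = a.get('contact', {}), b.get('contact', {})
    ca = json.loads(ca) if isinstance(ca, str) else ca
    cb = json.loads(cb) if isinstance(cb, str) else cb
    if ca.get('github') and ca.get('github') == cb.get('github'):
        return True  # both records point at the same github account
    if ca.get('github') == ub or cb.get('github') == ua:
        return True  # one record's github contact is the other's username
    if ca.get('email') and ca.get('email') == cb.get('email'):
        return True
    return False

a = {'platform': 'mastodon', 'username': '@alice@fosstodon.org', 'contact': {}}
b = {'platform': 'github', 'username': 'alice', 'contact': {}}
print(same_person(a, b))  # True: same handle once the instance suffix is stripped
```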

@@ -1,137 +0,0 @@
"""
matchd/rank.py - score and rank match quality
"""
from itertools import combinations
from .fingerprint import generate_fingerprint
from .overlap import find_overlap, is_same_person
from scoutd.deep import check_already_connected
def rank_matches(matches):
"""
rank a list of matches by quality
returns sorted list with quality scores
"""
ranked = []
for match in matches:
# base score from overlap
score = match.get('overlap_score', 0)
# bonus for geographic match
if match.get('geographic_match'):
score *= 1.2
# bonus for high fingerprint similarity
fp_sim = match.get('fingerprint_similarity')
if fp_sim and fp_sim > 0.7:
score *= 1.3
# bonus for complementary skills
comp_skills = match.get('complementary_skills', [])
if len(comp_skills) >= 3:
score *= 1.1
match['quality_score'] = score
ranked.append(match)
# sort by quality score
ranked.sort(key=lambda x: x['quality_score'], reverse=True)
return ranked
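the multipliers in rank_matches compound, so a match that hits all three bonuses gains ~72% over its base overlap score. `quality` below restates the chain for a single match dict (toy numbers, standalone):

```python
def quality(match):
    """restates rank_matches' multiplier chain for one match dict"""
    score = match.get('overlap_score', 0)
    if match.get('geographic_match'):
        score *= 1.2  # a shared region makes the intro more actionable
    fp = match.get('fingerprint_similarity')
    if fp and fp > 0.7:
        score *= 1.3  # strong values alignment
    if len(match.get('complementary_skills', [])) >= 3:
        score *= 1.1  # they can teach each other something
    return score

m = {'overlap_score': 100, 'geographic_match': True,
     'fingerprint_similarity': 0.8, 'complementary_skills': ['rust', 'css', 'nix']}
print(round(quality(m), 1))  # 171.6 = 100 * 1.2 * 1.3 * 1.1
```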
def find_all_matches(db, min_score=30, min_overlap=20):
"""
find all potential matches from database
returns list of match dicts
"""
print("matchd: finding all potential matches...")
# get all humans above threshold
humans = db.get_all_humans(min_score=min_score)
print(f" {len(humans)} humans to match")
# generate fingerprints
fingerprints = {}
for human in humans:
fp = generate_fingerprint(human)
fingerprints[human['id']] = fp
db.save_fingerprint(human['id'], fp)
print(f" generated {len(fingerprints)} fingerprints")
# find all pairs
matches = []
checked = 0
skipped_same = 0
skipped_connected = 0
for human_a, human_b in combinations(humans, 2):
checked += 1
# skip if likely same person
if is_same_person(human_a, human_b):
skipped_same += 1
continue
# skip if already connected (same org, company, co-contributors)
connected, reason = check_already_connected(human_a, human_b)
if connected:
skipped_connected += 1
continue
# calculate overlap
fp_a = fingerprints.get(human_a['id'])
fp_b = fingerprints.get(human_b['id'])
overlap = find_overlap(human_a, human_b, fp_a, fp_b)
if overlap['overlap_score'] >= min_overlap:
match = {
'human_a': human_a,
'human_b': human_b,
**overlap
}
matches.append(match)
# save to db
db.save_match(human_a['id'], human_b['id'], overlap)
if checked % 1000 == 0:
print(f" checked {checked} pairs, {len(matches)} matches so far...")
print(f" checked {checked} pairs")
print(f" skipped {skipped_same} (same person), {skipped_connected} (already connected)")
print(f" found {len(matches)} potential matches")
# rank them
ranked = rank_matches(matches)
return ranked
def get_top_matches(db, limit=50):
"""
get top matches from database
"""
match_rows = db.get_matches(limit=limit)
matches = []
for row in match_rows:
human_a = db.get_human_by_id(row['human_a_id'])
human_b = db.get_human_by_id(row['human_b_id'])
if human_a and human_b:
matches.append({
'id': row['id'],
'human_a': human_a,
'human_b': human_b,
'overlap_score': row['overlap_score'],
'overlap_reasons': row['overlap_reasons'],
'geographic_match': row['geographic_match'],
'status': row['status'],
})
return matches

@@ -1,3 +0,0 @@
name: connectd add-ons
url: https://github.com/sudoxnym/connectd
maintainer: sudoxnym

@@ -1,2 +0,0 @@
requests>=2.28.0
beautifulsoup4>=4.12.0

@@ -1,45 +0,0 @@
#!/usr/bin/with-contenv bashio
# shellcheck shell=bash
# read options from add-on config
# quote the substitutions so multi-word values (e.g. host_interests, host_looking_for) survive word splitting
export HOST_USER="$(bashio::config 'host_user')"
export HOST_NAME="$(bashio::config 'host_name')"
export HOST_EMAIL="$(bashio::config 'host_email')"
export HOST_MASTODON="$(bashio::config 'host_mastodon')"
export HOST_REDDIT="$(bashio::config 'host_reddit')"
export HOST_LEMMY="$(bashio::config 'host_lemmy')"
export HOST_LOBSTERS="$(bashio::config 'host_lobsters')"
export HOST_MATRIX="$(bashio::config 'host_matrix')"
export HOST_DISCORD="$(bashio::config 'host_discord')"
export HOST_BLUESKY="$(bashio::config 'host_bluesky')"
export HOST_LOCATION="$(bashio::config 'host_location')"
export HOST_INTERESTS="$(bashio::config 'host_interests')"
export HOST_LOOKING_FOR="$(bashio::config 'host_looking_for')"
export GITHUB_TOKEN="$(bashio::config 'github_token')"
export GROQ_API_KEY="$(bashio::config 'groq_api_key')"
export MASTODON_TOKEN="$(bashio::config 'mastodon_token')"
export MASTODON_INSTANCE="$(bashio::config 'mastodon_instance')"
export DISCORD_BOT_TOKEN="$(bashio::config 'discord_bot_token')"
export DISCORD_TARGET_SERVERS="$(bashio::config 'discord_target_servers')"
export LEMMY_INSTANCE="$(bashio::config 'lemmy_instance')"
export LEMMY_USERNAME="$(bashio::config 'lemmy_username')"
export LEMMY_PASSWORD="$(bashio::config 'lemmy_password')"
export SMTP_HOST="$(bashio::config 'smtp_host')"
export SMTP_PORT="$(bashio::config 'smtp_port')"
export SMTP_USER="$(bashio::config 'smtp_user')"
export SMTP_PASS="$(bashio::config 'smtp_pass')"
# set data paths
export DB_PATH=/data/db/connectd.db
export CACHE_DIR=/data/cache
bashio::log.info "starting connectd daemon..."
bashio::log.info "HOST_USER: ${HOST_USER}"
cd /app
exec python3 daemon.py

@@ -1,29 +0,0 @@
"""
scoutd - discovery module
finds humans across platforms
"""
from .github import scrape_github, get_github_user
from .reddit import scrape_reddit
from .mastodon import scrape_mastodon
from .lobsters import scrape_lobsters
from .matrix import scrape_matrix
from .twitter import scrape_twitter
from .bluesky import scrape_bluesky
from .lemmy import scrape_lemmy
from .discord import scrape_discord, send_discord_dm
from .deep import (
deep_scrape_github_user, check_already_connected, save_deep_profile,
determine_contact_method, get_cached_orgs, cache_orgs,
get_emails_from_commit_history, scrape_website_for_emails,
)
__all__ = [
'scrape_github', 'scrape_reddit', 'scrape_mastodon', 'scrape_lobsters',
'scrape_matrix', 'scrape_twitter', 'scrape_bluesky', 'scrape_lemmy',
'scrape_discord', 'send_discord_dm',
'get_github_user', 'deep_scrape_github_user',
'check_already_connected', 'save_deep_profile', 'determine_contact_method',
'get_cached_orgs', 'cache_orgs', 'get_emails_from_commit_history',
'scrape_website_for_emails',
]

@@ -1,216 +0,0 @@
"""
scoutd/bluesky.py - bluesky/atproto discovery
bluesky has an open API via AT Protocol - no auth needed for public data
many twitter refugees landed here, good source for aligned builders
"""
import requests
import json
import time
from datetime import datetime
from pathlib import Path
from .signals import analyze_text
HEADERS = {'User-Agent': 'connectd/1.0', 'Accept': 'application/json'}
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'bluesky'
# public bluesky API
BSKY_API = 'https://public.api.bsky.app'
# hashtags to search
ALIGNED_HASHTAGS = [
'selfhosted', 'homelab', 'homeassistant', 'foss', 'opensource',
'privacy', 'solarpunk', 'cooperative', 'mutualaid', 'localfirst',
'indieweb', 'smallweb', 'permacomputing', 'techworkers', 'coops',
]
def _api_get(endpoint, params=None):
"""rate-limited API request with caching"""
url = f"{BSKY_API}{endpoint}"
cache_key = f"{url}_{json.dumps(params or {}, sort_keys=True)}"
cache_file = CACHE_DIR / f"{hash(cache_key) & 0xffffffff}.json"
CACHE_DIR.mkdir(parents=True, exist_ok=True)
if cache_file.exists():
try:
data = json.loads(cache_file.read_text())
if time.time() - data.get('_cached_at', 0) < 3600:
return data.get('_data')
except (json.JSONDecodeError, OSError):
pass
time.sleep(0.5) # rate limit
try:
resp = requests.get(url, headers=HEADERS, params=params, timeout=30)
resp.raise_for_status()
result = resp.json()
cache_file.write_text(json.dumps({'_cached_at': time.time(), '_data': result}))
return result
except requests.exceptions.RequestException as e:
print(f" bluesky api error: {e}")
return None
def search_posts(query, limit=50):
"""search for posts containing query"""
result = _api_get('/xrpc/app.bsky.feed.searchPosts', {
'q': query,
'limit': min(limit, 100),
})
if not result:
return []
posts = result.get('posts', [])
return posts
def get_profile(handle):
"""get user profile by handle (e.g., user.bsky.social)"""
result = _api_get('/xrpc/app.bsky.actor.getProfile', {'actor': handle})
return result
def get_author_feed(handle, limit=30):
"""get user's recent posts"""
result = _api_get('/xrpc/app.bsky.feed.getAuthorFeed', {
'actor': handle,
'limit': limit,
})
if not result:
return []
return result.get('feed', [])
def analyze_bluesky_user(handle):
"""analyze a bluesky user for alignment"""
profile = get_profile(handle)
if not profile:
return None
# collect text
text_parts = []
# bio/description
description = profile.get('description', '')
if description:
text_parts.append(description)
display_name = profile.get('displayName', '')
if display_name:
text_parts.append(display_name)
# recent posts
feed = get_author_feed(handle, limit=20)
for item in feed:
post = item.get('post', {})
record = post.get('record', {})
text = record.get('text', '')
if text:
text_parts.append(text)
full_text = ' '.join(text_parts)
text_score, positive_signals, negative_signals = analyze_text(full_text)
# bluesky bonus (decentralized, values-aligned platform choice)
platform_bonus = 10
total_score = text_score + platform_bonus
# activity bonus
followers = profile.get('followersCount', 0)
posts_count = profile.get('postsCount', 0)
if posts_count >= 100:
total_score += 5
if followers >= 100:
total_score += 5
# confidence
confidence = 0.35 # base for bluesky (better signal than twitter)
if len(text_parts) > 5:
confidence += 0.2
if len(positive_signals) >= 3:
confidence += 0.2
if posts_count >= 50:
confidence += 0.1
confidence = min(confidence, 0.85)
reasons = ['on bluesky (atproto)']
if positive_signals:
reasons.append(f"signals: {', '.join(positive_signals[:5])}")
if negative_signals:
reasons.append(f"WARNING: {', '.join(negative_signals)}")
return {
'platform': 'bluesky',
'username': handle,
'url': f"https://bsky.app/profile/{handle}",
'name': display_name or handle,
'bio': description,
'score': total_score,
'confidence': confidence,
'signals': positive_signals,
'negative_signals': negative_signals,
'followers': followers,
'posts_count': posts_count,
'reasons': reasons,
'contact': {
'bluesky': handle,
},
'scraped_at': datetime.now().isoformat(),
}
def scrape_bluesky(db, limit_per_hashtag=30):
"""full bluesky scrape"""
print("scoutd/bluesky: starting scrape...")
all_users = {}
for hashtag in ALIGNED_HASHTAGS:
print(f" #{hashtag}...")
# search for hashtag
posts = search_posts(f"#{hashtag}", limit=limit_per_hashtag)
for post in posts:
author = post.get('author', {})
handle = author.get('handle')
if handle and handle not in all_users:
all_users[handle] = {
'handle': handle,
'display_name': author.get('displayName'),
'hashtags': [hashtag],
}
elif handle and hashtag not in all_users[handle]['hashtags']:
all_users[handle]['hashtags'].append(hashtag)
print(f" found {len(posts)} posts")
# prioritize users in multiple hashtags
multi_hashtag = {h: d for h, d in all_users.items() if len(d.get('hashtags', [])) >= 2}
print(f" {len(multi_hashtag)} users in 2+ aligned hashtags")
# analyze
results = []
for handle in list(multi_hashtag.keys())[:100]:
try:
result = analyze_bluesky_user(handle)
if result and result['score'] > 0:
results.append(result)
db.save_human(result)
if result['score'] >= 30:
print(f" ★ @{handle}: {result['score']} pts")
except Exception as e:
print(f" error on {handle}: {e}")
print(f"scoutd/bluesky: found {len(results)} aligned humans")
return results
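the confidence figure attached to each bluesky profile is a simple capped schedule. `bluesky_confidence` below restates it standalone so the cap behavior is visible (it is not a function in the module — analyze_bluesky_user computes this inline):

```python
def bluesky_confidence(text_parts, positive_signals, posts_count):
    """restates the confidence schedule from analyze_bluesky_user"""
    confidence = 0.35  # base for bluesky (better signal than twitter)
    if text_parts > 5:
        confidence += 0.2   # enough text to judge from
    if positive_signals >= 3:
        confidence += 0.2   # multiple aligned signals
    if posts_count >= 50:
        confidence += 0.1   # active account
    return min(confidence, 0.85)  # never fully confident from public posts alone

print(round(bluesky_confidence(text_parts=21, positive_signals=4, posts_count=200), 2))  # 0.85
```

an active, well-aligned account hits the 0.85 ceiling; a sparse one stays at the 0.35 base.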

@@ -1,323 +0,0 @@
"""
scoutd/discord.py - discord discovery
discord requires a bot token to read messages.
target servers: programming help, career transition, indie hackers, etc.
SETUP:
1. create discord app at discord.com/developers
2. add bot, get token
3. join target servers with bot
4. set DISCORD_BOT_TOKEN env var
"""
import requests
import json
import time
import os
from datetime import datetime
from pathlib import Path
from .signals import analyze_text
from .lost import (
analyze_social_for_lost_signals,
classify_user,
)
DISCORD_BOT_TOKEN = os.environ.get('DISCORD_BOT_TOKEN', '')
DISCORD_API = 'https://discord.com/api/v10'
# default server IDs - values-aligned communities
# bot must be invited to these servers to scout them
# invite links for reference (use numeric IDs below):
# - self-hosted: discord.gg/self-hosted
# - foss-dev: discord.gg/foss-developers-group
# - grapheneos: discord.gg/grapheneos
# - queer-coded: discord.me/queer-coded
# - homelab: discord.gg/homelab
# - esphome: discord.gg/n9sdw7pnsn
# - home-assistant: discord.gg/home-assistant
# - linuxserver: discord.gg/linuxserver
# - proxmox-scripts: discord.gg/jsYVk5JBxq
DEFAULT_SERVERS = [
# self-hosted / foss / privacy
'693469700109369394', # self-hosted (selfhosted.show)
'920089648842293248', # foss developers group
'1176414688112820234', # grapheneos
# queer tech
'925804557001437184', # queer coded
# home automation / homelab
# note: these are large servers, bot needs to be invited
# '330944238910963714', # home assistant (150k+ members)
# '429907082951524364', # esphome (35k members)
# '478094546522079232', # homelab (35k members)
# '354974912613449730', # linuxserver.io (41k members)
]
# merge env var servers with defaults
_env_servers = os.environ.get('DISCORD_TARGET_SERVERS', '').split(',')
_env_servers = [s.strip() for s in _env_servers if s.strip()]
TARGET_SERVERS = list(set(DEFAULT_SERVERS + _env_servers))
# channels to focus on (keywords in channel name)
TARGET_CHANNEL_KEYWORDS = [
'help', 'career', 'jobs', 'learning', 'beginner',
'general', 'introductions', 'showcase', 'projects',
]
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'discord'
CACHE_DIR.mkdir(parents=True, exist_ok=True)
def get_headers():
"""get discord api headers"""
if not DISCORD_BOT_TOKEN:
return None
return {
'Authorization': f'Bot {DISCORD_BOT_TOKEN}',
'Content-Type': 'application/json',
}
def get_guild_channels(guild_id):
"""get channels in a guild"""
headers = get_headers()
if not headers:
return []
try:
resp = requests.get(
f'{DISCORD_API}/guilds/{guild_id}/channels',
headers=headers,
timeout=30
)
if resp.status_code == 200:
return resp.json()
return []
except Exception:
return []
def get_channel_messages(channel_id, limit=100):
"""get recent messages from a channel"""
headers = get_headers()
if not headers:
return []
try:
resp = requests.get(
f'{DISCORD_API}/channels/{channel_id}/messages',
headers=headers,
params={'limit': limit},
timeout=30
)
if resp.status_code == 200:
return resp.json()
return []
except Exception:
return []
def get_user_info(user_id):
"""get discord user info"""
headers = get_headers()
if not headers:
return None
try:
resp = requests.get(
f'{DISCORD_API}/users/{user_id}',
headers=headers,
timeout=30
)
if resp.status_code == 200:
return resp.json()
return None
except Exception:
return None
def analyze_discord_user(user_data, messages=None):
"""analyze a discord user for values alignment and lost signals"""
username = user_data.get('username', '')
display_name = user_data.get('global_name') or username
user_id = user_data.get('id')
# analyze messages
all_signals = []
all_text = []
total_score = 0
if messages:
for msg in messages[:20]:
content = msg.get('content', '')
if not content or len(content) < 20:
continue
all_text.append(content)
score, signals, _ = analyze_text(content)
all_signals.extend(signals)
total_score += score
all_signals = list(set(all_signals))
# lost builder detection
profile_for_lost = {
'bio': '',
'message_count': len(messages) if messages else 0,
}
posts_for_lost = [{'text': t} for t in all_text]
lost_signals, lost_weight = analyze_social_for_lost_signals(profile_for_lost, posts_for_lost)
lost_potential_score = lost_weight
user_type = classify_user(lost_potential_score, 50, total_score)
return {
'platform': 'discord',
'username': username,
'url': f"https://discord.com/users/{user_id}",
'name': display_name,
'bio': '',
'location': None,
'score': total_score,
'confidence': min(0.8, 0.2 + len(all_signals) * 0.1),
'signals': all_signals,
'negative_signals': [],
'reasons': [],
        # discriminators are retired; modern accounts report '0'
        'contact': {'discord': username if user_data.get('discriminator') in (None, '0') else f"{username}#{user_data['discriminator']}"},
'extra': {
'user_id': user_id,
'message_count': len(messages) if messages else 0,
},
'lost_potential_score': lost_potential_score,
'lost_signals': lost_signals,
'user_type': user_type,
}
def scrape_discord(db, limit_per_channel=50):
"""scrape discord servers for aligned builders"""
if not DISCORD_BOT_TOKEN:
print("discord: DISCORD_BOT_TOKEN not set, skipping")
return 0
    if not TARGET_SERVERS:
        print("discord: no target servers configured, skipping")
        return 0
print("scouting discord...")
found = 0
lost_found = 0
seen_users = set()
for guild_id in TARGET_SERVERS:
if not guild_id:
continue
guild_id = guild_id.strip()
channels = get_guild_channels(guild_id)
if not channels:
print(f" guild {guild_id}: no access or no channels")
continue
# filter to relevant channels
target_channels = []
for ch in channels:
if ch.get('type') != 0: # text channels only
continue
name = ch.get('name', '').lower()
if any(kw in name for kw in TARGET_CHANNEL_KEYWORDS):
target_channels.append(ch)
print(f" guild {guild_id}: {len(target_channels)} relevant channels")
for channel in target_channels[:5]: # limit channels per server
messages = get_channel_messages(channel['id'], limit=limit_per_channel)
if not messages:
continue
# group messages by user
user_messages = {}
for msg in messages:
author = msg.get('author', {})
if author.get('bot'):
continue
user_id = author.get('id')
if not user_id or user_id in seen_users:
continue
if user_id not in user_messages:
user_messages[user_id] = {'user': author, 'messages': []}
user_messages[user_id]['messages'].append(msg)
# analyze each user
for user_id, data in user_messages.items():
if user_id in seen_users:
continue
seen_users.add(user_id)
result = analyze_discord_user(data['user'], data['messages'])
if not result:
continue
if result['score'] >= 20 or result.get('lost_potential_score', 0) >= 30:
db.save_human(result)
found += 1
if result.get('user_type') in ['lost', 'both']:
lost_found += 1
time.sleep(1) # rate limit between channels
time.sleep(2) # between guilds
print(f"discord: found {found} humans ({lost_found} lost builders)")
return found
def send_discord_dm(user_id, message, dry_run=False):
"""send a DM to a discord user"""
if not DISCORD_BOT_TOKEN:
return False, "DISCORD_BOT_TOKEN not set"
if dry_run:
print(f" [dry run] would DM discord user {user_id}")
return True, "dry run"
headers = get_headers()
try:
# create DM channel
dm_resp = requests.post(
f'{DISCORD_API}/users/@me/channels',
headers=headers,
json={'recipient_id': user_id},
timeout=30
)
if dm_resp.status_code not in [200, 201]:
return False, f"couldn't create DM channel: {dm_resp.status_code}"
channel_id = dm_resp.json().get('id')
# send message
msg_resp = requests.post(
f'{DISCORD_API}/channels/{channel_id}/messages',
headers=headers,
json={'content': message},
timeout=30
)
if msg_resp.status_code in [200, 201]:
return True, f"sent to {user_id}"
else:
return False, f"send failed: {msg_resp.status_code}"
except Exception as e:
return False, str(e)
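scrape_discord's inner loop buckets channel messages by author before scoring, so each human is analyzed once from all their messages. that grouping step in isolation (helper name and sample payload are illustrative; the real loop also skips users already seen in other channels):

```python
def group_messages_by_user(messages):
    """bucket raw discord messages by author id, dropping bot authors"""
    grouped = {}
    for msg in messages:
        author = msg.get('author', {})
        if author.get('bot') or not author.get('id'):
            continue
        bucket = grouped.setdefault(author['id'], {'user': author, 'messages': []})
        bucket['messages'].append(msg)
    return grouped

msgs = [
    {'author': {'id': '1', 'username': 'ada'}, 'content': 'hi'},
    {'author': {'id': '2', 'username': 'hook', 'bot': True}, 'content': 'beep'},
    {'author': {'id': '1', 'username': 'ada'}, 'content': 'again'},
]
```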


@ -1,330 +0,0 @@
"""
scoutd/github.py - github discovery
scrapes repos, bios, commit patterns to find aligned builders
also detects lost builders - people with potential who haven't started yet
"""
import requests
import json
import time
import os
from datetime import datetime
from pathlib import Path
from collections import defaultdict
from .signals import analyze_text, TARGET_TOPICS, ECOSYSTEM_REPOS
from .lost import (
analyze_github_for_lost_signals,
analyze_text_for_lost_signals,
classify_user,
get_signal_descriptions,
)
from .handles import discover_all_handles
# rate limit: 60/hr unauthenticated, 5000/hr with token
GITHUB_TOKEN = os.environ.get('GITHUB_TOKEN', '')
HEADERS = {'Accept': 'application/vnd.github.v3+json'}
if GITHUB_TOKEN:
HEADERS['Authorization'] = f'token {GITHUB_TOKEN}'
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'github'
def _api_get(url, params=None):
    """rate-limited api request with on-disk caching (1 hour expiry)"""
    import hashlib  # stable digest: builtin hash() is salted per process
    cache_key = f"{url}_{json.dumps(params or {}, sort_keys=True)}"
    cache_file = CACHE_DIR / f"{hashlib.sha1(cache_key.encode()).hexdigest()[:16]}.json"
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    # check cache
    if cache_file.exists():
        try:
            data = json.loads(cache_file.read_text())
            if time.time() - data.get('_cached_at', 0) < 3600:
                return data.get('_data')
        except (json.JSONDecodeError, OSError):
            pass
# rate limit
time.sleep(0.5 if GITHUB_TOKEN else 2)
try:
resp = requests.get(url, headers=HEADERS, params=params, timeout=30)
resp.raise_for_status()
result = resp.json()
# cache
cache_file.write_text(json.dumps({'_cached_at': time.time(), '_data': result}))
return result
except requests.exceptions.RequestException as e:
print(f" github api error: {e}")
return None
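the cache in _api_get is just one JSON file per request with a `_cached_at` stamp checked against a 1-hour TTL. the read/write cycle in a standalone sketch (function and file names are illustrative):

```python
import json
import tempfile
import time
from pathlib import Path

TTL = 3600  # seconds, matching the 1-hour expiry above

def cache_get(cache_file: Path):
    """return the cached payload if present and fresh, else None"""
    try:
        data = json.loads(cache_file.read_text())
    except (OSError, json.JSONDecodeError):
        return None
    if time.time() - data.get('_cached_at', 0) < TTL:
        return data.get('_data')
    return None

def cache_put(cache_file: Path, payload):
    """stamp and persist a payload"""
    cache_file.write_text(json.dumps({'_cached_at': time.time(), '_data': payload}))
```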
def search_repos_by_topic(topic, per_page=100):
"""search repos by topic tag"""
url = 'https://api.github.com/search/repositories'
params = {'q': f'topic:{topic}', 'sort': 'stars', 'order': 'desc', 'per_page': per_page}
data = _api_get(url, params)
return data.get('items', []) if data else []
def get_repo_contributors(repo_full_name, per_page=100):
"""get top contributors to a repo"""
url = f'https://api.github.com/repos/{repo_full_name}/contributors'
return _api_get(url, {'per_page': per_page}) or []
def get_github_user(login):
"""get full user profile"""
url = f'https://api.github.com/users/{login}'
return _api_get(url)
def get_user_repos(login, per_page=100):
"""get user's repos"""
url = f'https://api.github.com/users/{login}/repos'
return _api_get(url, {'per_page': per_page, 'sort': 'pushed'}) or []
def analyze_github_user(login):
"""
analyze a github user for values alignment
returns dict with score, confidence, signals, contact info
"""
user = get_github_user(login)
if not user:
return None
repos = get_user_repos(login)
# collect text corpus
text_parts = []
if user.get('bio'):
text_parts.append(user['bio'])
if user.get('company'):
text_parts.append(user['company'])
if user.get('location'):
text_parts.append(user['location'])
# analyze repos
all_topics = []
languages = defaultdict(int)
total_stars = 0
for repo in repos:
if repo.get('description'):
text_parts.append(repo['description'])
if repo.get('topics'):
all_topics.extend(repo['topics'])
if repo.get('language'):
languages[repo['language']] += 1
total_stars += repo.get('stargazers_count', 0)
full_text = ' '.join(text_parts)
# analyze signals
text_score, positive_signals, negative_signals = analyze_text(full_text)
# topic alignment
aligned_topics = set(all_topics) & set(TARGET_TOPICS)
topic_score = len(aligned_topics) * 10
# builder score (repos indicate building, not just talking)
builder_score = 0
if len(repos) > 20:
builder_score = 15
elif len(repos) > 10:
builder_score = 10
elif len(repos) > 5:
builder_score = 5
# hireable bonus
hireable_score = 5 if user.get('hireable') else 0
# total score
total_score = text_score + topic_score + builder_score + hireable_score
# === LOST BUILDER DETECTION ===
# build profile dict for lost analysis
profile_for_lost = {
'bio': user.get('bio'),
'repos': repos,
'public_repos': user.get('public_repos', len(repos)),
'followers': user.get('followers', 0),
'following': user.get('following', 0),
'extra': {
'top_repos': repos[:10],
},
}
# analyze for lost signals
lost_signals, lost_weight = analyze_github_for_lost_signals(profile_for_lost)
# also check text for lost language patterns
text_lost_signals, text_lost_weight = analyze_text_for_lost_signals(full_text)
for sig in text_lost_signals:
if sig not in lost_signals:
lost_signals.append(sig)
lost_weight += text_lost_weight
lost_potential_score = lost_weight
# classify: builder, lost, both, or none
user_type = classify_user(lost_potential_score, builder_score, total_score)
# confidence based on data richness
confidence = 0.3
if user.get('bio'):
confidence += 0.15
if len(repos) > 5:
confidence += 0.15
if len(text_parts) > 5:
confidence += 0.15
if user.get('email') or user.get('blog') or user.get('twitter_username'):
confidence += 0.15
if total_stars > 100:
confidence += 0.1
confidence = min(confidence, 1.0)
# build reasons
reasons = []
if positive_signals:
reasons.append(f"signals: {', '.join(positive_signals[:5])}")
if aligned_topics:
reasons.append(f"topics: {', '.join(list(aligned_topics)[:5])}")
if builder_score > 0:
reasons.append(f"builder ({len(repos)} repos)")
if negative_signals:
reasons.append(f"WARNING: {', '.join(negative_signals)}")
# add lost reasons if applicable
if user_type == 'lost' or user_type == 'both':
lost_descriptions = get_signal_descriptions(lost_signals)
if lost_descriptions:
reasons.append(f"LOST SIGNALS: {', '.join(lost_descriptions[:3])}")
# === DEEP HANDLE DISCOVERY ===
# follow blog links, scrape websites, find ALL social handles
handles, discovered_emails = discover_all_handles(user)
# merge discovered emails with github email
all_emails = discovered_emails or []
if user.get('email'):
all_emails.append(user['email'])
all_emails = list(set(e for e in all_emails if e and 'noreply' not in e.lower()))
return {
'platform': 'github',
'username': login,
'url': f"https://github.com/{login}",
'name': user.get('name'),
'bio': user.get('bio'),
'location': user.get('location'),
'score': total_score,
'confidence': confidence,
'signals': positive_signals,
'negative_signals': negative_signals,
'topics': list(aligned_topics),
'languages': dict(languages),
'repo_count': len(repos),
'total_stars': total_stars,
'reasons': reasons,
'contact': {
'email': all_emails[0] if all_emails else None,
'emails': all_emails,
'blog': user.get('blog'),
'twitter': user.get('twitter_username') or handles.get('twitter'),
'mastodon': handles.get('mastodon'),
'bluesky': handles.get('bluesky'),
'matrix': handles.get('matrix'),
'lemmy': handles.get('lemmy'),
},
'extra': {
'topics': list(aligned_topics),
'languages': dict(languages),
'repo_count': len(repos),
'total_stars': total_stars,
'hireable': user.get('hireable', False),
'handles': handles, # all discovered handles
},
'hireable': user.get('hireable', False),
'scraped_at': datetime.now().isoformat(),
# lost builder fields
'lost_potential_score': lost_potential_score,
'lost_signals': lost_signals,
'user_type': user_type, # 'builder', 'lost', 'both', 'none'
}
def scrape_github(db, limit_per_source=50):
"""
full github scrape
returns list of analyzed users
"""
print("scoutd/github: starting scrape...")
all_logins = set()
# 1. ecosystem repo contributors
print(" scraping ecosystem repo contributors...")
for repo in ECOSYSTEM_REPOS:
contributors = get_repo_contributors(repo, per_page=limit_per_source)
for c in contributors:
login = c.get('login')
if login and not login.endswith('[bot]'):
all_logins.add(login)
print(f" {repo}: {len(contributors)} contributors")
# 2. topic repos
print(" scraping topic repos...")
for topic in TARGET_TOPICS[:10]:
repos = search_repos_by_topic(topic, per_page=30)
for repo in repos:
owner = repo.get('owner', {}).get('login')
if owner and not owner.endswith('[bot]'):
all_logins.add(owner)
print(f" #{topic}: {len(repos)} repos")
print(f" found {len(all_logins)} unique users to analyze")
# analyze each
results = []
builders_found = 0
lost_found = 0
for i, login in enumerate(all_logins):
if i % 20 == 0:
print(f" analyzing... {i}/{len(all_logins)}")
try:
result = analyze_github_user(login)
if result and result['score'] > 0:
results.append(result)
db.save_human(result)
user_type = result.get('user_type', 'none')
if user_type == 'builder':
builders_found += 1
if result['score'] >= 50:
print(f"{login}: {result['score']} pts, {result['confidence']:.0%} conf")
elif user_type == 'lost':
lost_found += 1
lost_score = result.get('lost_potential_score', 0)
if lost_score >= 40:
print(f" 💔 {login}: lost_score={lost_score}, values={result['score']} pts")
elif user_type == 'both':
builders_found += 1
lost_found += 1
print(f"{login}: recovering builder (lost={result.get('lost_potential_score', 0)}, active={result['score']})")
except Exception as e:
print(f" error on {login}: {e}")
print(f"scoutd/github: found {len(results)} aligned humans")
print(f" - {builders_found} active builders")
print(f" - {lost_found} lost builders (need encouragement)")
return results
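the confidence figure in analyze_github_user is a plain data-richness heuristic: start at 0.3, add 0.15 for each of bio / more than 5 repos / more than 5 text fragments / any contact field, add 0.1 for over 100 stars, cap at 1.0. pulled out as a standalone function (the name is illustrative):

```python
def estimate_confidence(user, repos, text_parts, total_stars):
    """data-richness confidence, mirroring the heuristic in analyze_github_user"""
    confidence = 0.3
    if user.get('bio'):
        confidence += 0.15
    if len(repos) > 5:
        confidence += 0.15
    if len(text_parts) > 5:
        confidence += 0.15
    if user.get('email') or user.get('blog') or user.get('twitter_username'):
        confidence += 0.15
    if total_stars > 100:
        confidence += 0.1
    return min(confidence, 1.0)
```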


@ -1,507 +0,0 @@
"""
scoutd/handles.py - comprehensive social handle discovery
finds ALL social handles from:
- github bio/profile
- personal websites (rel="me", footers, contact pages, json-ld)
- README files
- linktree/bio.link/carrd pages
- any linked pages
stores structured handle data for activity-based contact selection
"""
import re
import json
import requests
from urllib.parse import urlparse, urljoin
from bs4 import BeautifulSoup
HEADERS = {'User-Agent': 'Mozilla/5.0 (compatible; connectd/1.0)'}
# platform URL patterns -> (platform, handle_extractor)
PLATFORM_PATTERNS = {
    # fediverse
    'mastodon': [
        # negative lookahead keeps this greedy /@user pattern from swallowing
        # medium/threads/youtube URLs matched by more specific entries below
        (r'https?://(?!(?:www\.)?(?:medium\.com|threads\.net|youtube\.com))([^/]+)/@([^/?#]+)', lambda m: f"@{m.group(2)}@{m.group(1)}"),
        (r'https?://([^/]+)/users/([^/?#]+)', lambda m: f"@{m.group(2)}@{m.group(1)}"),
    ],
'pixelfed': [
(r'https?://pixelfed\.social/@([^/?#]+)', lambda m: f"@{m.group(1)}@pixelfed.social"),
(r'https?://([^/]*pixelfed[^/]*)/@([^/?#]+)', lambda m: f"@{m.group(2)}@{m.group(1)}"),
],
'lemmy': [
(r'https?://([^/]+)/u/([^/?#]+)', lambda m: f"@{m.group(2)}@{m.group(1)}"),
(r'https?://lemmy\.([^/]+)/u/([^/?#]+)', lambda m: f"@{m.group(2)}@lemmy.{m.group(1)}"),
],
# mainstream
'twitter': [
(r'https?://(?:www\.)?(?:twitter|x)\.com/([^/?#]+)', lambda m: f"@{m.group(1)}"),
],
'bluesky': [
(r'https?://bsky\.app/profile/([^/?#]+)', lambda m: m.group(1)),
(r'https?://([^.]+)\.bsky\.social', lambda m: f"{m.group(1)}.bsky.social"),
],
'threads': [
(r'https?://(?:www\.)?threads\.net/@([^/?#]+)', lambda m: f"@{m.group(1)}"),
],
'instagram': [
(r'https?://(?:www\.)?instagram\.com/([^/?#]+)', lambda m: f"@{m.group(1)}"),
],
'facebook': [
(r'https?://(?:www\.)?facebook\.com/([^/?#]+)', lambda m: m.group(1)),
],
'linkedin': [
(r'https?://(?:www\.)?linkedin\.com/in/([^/?#]+)', lambda m: m.group(1)),
(r'https?://(?:www\.)?linkedin\.com/company/([^/?#]+)', lambda m: f"company/{m.group(1)}"),
],
# dev platforms
'github': [
(r'https?://(?:www\.)?github\.com/([^/?#]+)', lambda m: m.group(1)),
],
'gitlab': [
(r'https?://(?:www\.)?gitlab\.com/([^/?#]+)', lambda m: m.group(1)),
],
'codeberg': [
(r'https?://codeberg\.org/([^/?#]+)', lambda m: m.group(1)),
],
'sourcehut': [
(r'https?://sr\.ht/~([^/?#]+)', lambda m: f"~{m.group(1)}"),
(r'https?://git\.sr\.ht/~([^/?#]+)', lambda m: f"~{m.group(1)}"),
],
# chat
'matrix': [
(r'https?://matrix\.to/#/(@[^:]+:[^/?#]+)', lambda m: m.group(1)),
],
'discord': [
(r'https?://discord\.gg/([^/?#]+)', lambda m: f"invite/{m.group(1)}"),
(r'https?://discord\.com/invite/([^/?#]+)', lambda m: f"invite/{m.group(1)}"),
],
'telegram': [
(r'https?://t\.me/([^/?#]+)', lambda m: f"@{m.group(1)}"),
],
# content
'youtube': [
(r'https?://(?:www\.)?youtube\.com/@([^/?#]+)', lambda m: f"@{m.group(1)}"),
(r'https?://(?:www\.)?youtube\.com/c(?:hannel)?/([^/?#]+)', lambda m: m.group(1)),
],
'twitch': [
(r'https?://(?:www\.)?twitch\.tv/([^/?#]+)', lambda m: m.group(1)),
],
'substack': [
(r'https?://([^.]+)\.substack\.com', lambda m: m.group(1)),
],
'medium': [
(r'https?://(?:www\.)?medium\.com/@([^/?#]+)', lambda m: f"@{m.group(1)}"),
(r'https?://([^.]+)\.medium\.com', lambda m: m.group(1)),
],
'devto': [
(r'https?://dev\.to/([^/?#]+)', lambda m: m.group(1)),
],
# funding
'kofi': [
(r'https?://ko-fi\.com/([^/?#]+)', lambda m: m.group(1)),
],
'patreon': [
(r'https?://(?:www\.)?patreon\.com/([^/?#]+)', lambda m: m.group(1)),
],
'liberapay': [
(r'https?://liberapay\.com/([^/?#]+)', lambda m: m.group(1)),
],
'github_sponsors': [
(r'https?://github\.com/sponsors/([^/?#]+)', lambda m: m.group(1)),
],
# link aggregators (we'll parse these specially)
'linktree': [
(r'https?://linktr\.ee/([^/?#]+)', lambda m: m.group(1)),
],
'biolink': [
(r'https?://bio\.link/([^/?#]+)', lambda m: m.group(1)),
],
'carrd': [
(r'https?://([^.]+)\.carrd\.co', lambda m: m.group(1)),
],
}
# fediverse handle pattern: @user@instance
FEDIVERSE_HANDLE_PATTERN = re.compile(r'@([\w.-]+)@([\w.-]+\.[\w]+)')
# email pattern
EMAIL_PATTERN = re.compile(r'\b([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,})\b')
# known fediverse instances (for context-free handle detection)
KNOWN_FEDIVERSE_INSTANCES = [
'mastodon.social', 'mastodon.online', 'mstdn.social', 'mas.to',
'tech.lgbt', 'fosstodon.org', 'hackers.town', 'social.coop',
'kolektiva.social', 'solarpunk.moe', 'wandering.shop',
'elekk.xyz', 'cybre.space', 'octodon.social', 'chaos.social',
'infosec.exchange', 'ruby.social', 'phpc.social', 'toot.cafe',
'mstdn.io', 'pixelfed.social', 'lemmy.ml', 'lemmy.world',
'kbin.social', 'pleroma.site', 'akkoma.dev',
]
def extract_handle_from_url(url):
"""extract platform and handle from a URL"""
for platform, patterns in PLATFORM_PATTERNS.items():
for pattern, extractor in patterns:
match = re.match(pattern, url, re.I)
if match:
return platform, extractor(match)
return None, None
def extract_fediverse_handles(text):
"""find @user@instance.tld patterns in text"""
handles = []
for match in FEDIVERSE_HANDLE_PATTERN.finditer(text):
user, instance = match.groups()
handles.append(f"@{user}@{instance}")
return handles
def extract_emails(text):
"""find email addresses in text"""
emails = []
for match in EMAIL_PATTERN.finditer(text):
email = match.group(1)
# filter out common non-personal emails
if not any(x in email.lower() for x in ['noreply', 'no-reply', 'donotreply', 'example.com']):
emails.append(email)
return emails
def scrape_page(url, timeout=15):
"""fetch and parse a web page"""
try:
resp = requests.get(url, headers=HEADERS, timeout=timeout, allow_redirects=True)
resp.raise_for_status()
return BeautifulSoup(resp.text, 'html.parser'), resp.text
    except Exception:
        return None, None
def extract_rel_me_links(soup):
"""extract rel="me" links (used for verification)"""
links = []
if not soup:
return links
for a in soup.find_all('a', rel=lambda x: x and 'me' in x):
href = a.get('href')
if href:
links.append(href)
return links
def extract_social_links_from_page(soup, base_url=None):
"""extract all social links from a page"""
links = []
if not soup:
return links
# all links
for a in soup.find_all('a', href=True):
href = a['href']
if base_url and not href.startswith('http'):
href = urljoin(base_url, href)
# check if it's a known social platform
platform, handle = extract_handle_from_url(href)
if platform:
links.append({'platform': platform, 'handle': handle, 'url': href})
return links
def extract_json_ld(soup):
"""extract structured data from JSON-LD"""
data = {}
if not soup:
return data
for script in soup.find_all('script', type='application/ld+json'):
try:
ld = json.loads(script.string)
# look for sameAs links (social profiles)
if isinstance(ld, dict):
same_as = ld.get('sameAs', [])
if isinstance(same_as, str):
same_as = [same_as]
for url in same_as:
platform, handle = extract_handle_from_url(url)
if platform:
data[platform] = handle
        except (json.JSONDecodeError, TypeError):  # script.string may be None
            pass
return data
def scrape_linktree(url):
"""scrape a linktree/bio.link/carrd page for all links"""
handles = {}
soup, raw = scrape_page(url)
if not soup:
return handles
# linktree uses data attributes and JS, but links are often in the HTML
links = extract_social_links_from_page(soup, url)
for link in links:
if link['platform'] not in ['linktree', 'biolink', 'carrd']:
handles[link['platform']] = link['handle']
# also check for fediverse handles in text
if raw:
fedi_handles = extract_fediverse_handles(raw)
if fedi_handles:
handles['mastodon'] = fedi_handles[0]
return handles
def scrape_website_for_handles(url, follow_links=True):
"""
comprehensive website scrape for social handles
checks:
- rel="me" links
- social links in page
- json-ld structured data
- /about and /contact pages
- fediverse handles in text
- emails
"""
handles = {}
emails = []
soup, raw = scrape_page(url)
if not soup:
return handles, emails
# 1. rel="me" links (most authoritative)
rel_me = extract_rel_me_links(soup)
for link in rel_me:
platform, handle = extract_handle_from_url(link)
if platform and platform not in handles:
handles[platform] = handle
# 2. all social links on page
social_links = extract_social_links_from_page(soup, url)
for link in social_links:
if link['platform'] not in handles:
handles[link['platform']] = link['handle']
# 3. json-ld structured data
json_ld = extract_json_ld(soup)
for platform, handle in json_ld.items():
if platform not in handles:
handles[platform] = handle
# 4. fediverse handles in text
if raw:
fedi = extract_fediverse_handles(raw)
if fedi and 'mastodon' not in handles:
handles['mastodon'] = fedi[0]
# emails
emails = extract_emails(raw)
# 5. follow links to /about, /contact
if follow_links:
parsed = urlparse(url)
base = f"{parsed.scheme}://{parsed.netloc}"
for path in ['/about', '/contact', '/links', '/social']:
try:
sub_soup, sub_raw = scrape_page(base + path)
if sub_soup:
sub_links = extract_social_links_from_page(sub_soup, base)
for link in sub_links:
if link['platform'] not in handles:
handles[link['platform']] = link['handle']
if sub_raw:
fedi = extract_fediverse_handles(sub_raw)
if fedi and 'mastodon' not in handles:
handles['mastodon'] = fedi[0]
emails.extend(extract_emails(sub_raw))
            except Exception:
                pass
# 6. check for linktree etc in links and follow them
for platform in ['linktree', 'biolink', 'carrd']:
if platform in handles:
# this is actually a link aggregator, scrape it
link_url = None
for link in social_links:
if link['platform'] == platform:
link_url = link['url']
break
if link_url:
aggregator_handles = scrape_linktree(link_url)
for p, h in aggregator_handles.items():
if p not in handles:
handles[p] = h
del handles[platform] # remove the aggregator itself
return handles, list(set(emails))
def extract_handles_from_text(text):
"""extract handles from plain text (bio, README, etc)"""
handles = {}
if not text:
return handles
# fediverse handles
fedi = extract_fediverse_handles(text)
if fedi:
handles['mastodon'] = fedi[0]
# URL patterns in text
url_pattern = re.compile(r'https?://[^\s<>"\']+')
for match in url_pattern.finditer(text):
url = match.group(0).rstrip('.,;:!?)')
platform, handle = extract_handle_from_url(url)
if platform and platform not in handles:
handles[platform] = handle
# twitter-style @mentions (only if looks like twitter context)
if 'twitter' in text.lower() or 'x.com' in text.lower():
twitter_pattern = re.compile(r'(?:^|[^\w])@(\w{1,15})(?:[^\w]|$)')
for match in twitter_pattern.finditer(text):
if 'twitter' not in handles:
handles['twitter'] = f"@{match.group(1)}"
# matrix handles
matrix_pattern = re.compile(r'@([\w.-]+):([\w.-]+)')
for match in matrix_pattern.finditer(text):
if 'matrix' not in handles:
handles['matrix'] = f"@{match.group(1)}:{match.group(2)}"
return handles
def scrape_github_readme(username):
    """scrape user's profile README (the username/username repo)"""
    handles = {}
    emails = []
    # profile READMEs live on either the main or master branch
    for branch in ('main', 'master'):
        url = f"https://raw.githubusercontent.com/{username}/{username}/{branch}/README.md"
        try:
            resp = requests.get(url, headers=HEADERS, timeout=10)
            if resp.status_code == 200:
                handles = extract_handles_from_text(resp.text)
                emails = extract_emails(resp.text)
                break
        except requests.RequestException:
            continue
    return handles, emails
def discover_all_handles(github_profile):
"""
comprehensive handle discovery from a github profile dict
github_profile should contain:
- username
- bio
- blog (website URL)
- twitter_username
- etc.
"""
handles = {}
emails = []
username = github_profile.get('login') or github_profile.get('username')
print(f" discovering handles for {username}...")
# 1. github bio
bio = github_profile.get('bio', '')
if bio:
bio_handles = extract_handles_from_text(bio)
handles.update(bio_handles)
emails.extend(extract_emails(bio))
# 2. twitter from github profile
twitter = github_profile.get('twitter_username')
if twitter and 'twitter' not in handles:
handles['twitter'] = f"@{twitter}"
# 3. website from github profile
website = github_profile.get('blog')
if website:
if not website.startswith('http'):
website = f"https://{website}"
print(f" scraping website: {website}")
site_handles, site_emails = scrape_website_for_handles(website)
for p, h in site_handles.items():
if p not in handles:
handles[p] = h
emails.extend(site_emails)
# 4. profile README
if username:
print(f" checking profile README...")
readme_handles, readme_emails = scrape_github_readme(username)
for p, h in readme_handles.items():
if p not in handles:
handles[p] = h
emails.extend(readme_emails)
# 5. email from github profile
github_email = github_profile.get('email')
if github_email:
emails.append(github_email)
# dedupe emails
emails = list(set(e for e in emails if e and '@' in e and 'noreply' not in e.lower()))
print(f" found {len(handles)} handles, {len(emails)} emails")
return handles, emails
def merge_handles(existing, new):
"""merge new handles into existing, preferring more specific handles"""
for platform, handle in new.items():
if platform not in existing:
existing[platform] = handle
elif len(handle) > len(existing[platform]):
# prefer longer/more specific handles
existing[platform] = handle
return existing
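each PLATFORM_PATTERNS entry pairs an anchored regex with a small extractor lambda, and extract_handle_from_url tries them in insertion order. the mechanism in miniature, with just two of the platforms (names and sample URLs are illustrative):

```python
import re

# the PLATFORM_PATTERNS mechanism in miniature: anchored regex -> extractor
MINI_PATTERNS = {
    'bluesky': [
        (r'https?://bsky\.app/profile/([^/?#]+)', lambda m: m.group(1)),
    ],
    'twitter': [
        (r'https?://(?:www\.)?(?:twitter|x)\.com/([^/?#]+)', lambda m: f"@{m.group(1)}"),
    ],
}

def mini_extract(url):
    """return (platform, handle) for the first matching pattern, else (None, None)"""
    for platform, patterns in MINI_PATTERNS.items():
        for pattern, extractor in patterns:
            match = re.match(pattern, url, re.I)
            if match:
                return platform, extractor(match)
    return None, None
```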


@ -1,322 +0,0 @@
"""
scoutd/lemmy.py - lemmy (fediverse reddit) discovery
lemmy is federated so we hit multiple instances.
great for finding lost builders in communities like:
- /c/programming, /c/technology, /c/linux
- /c/antiwork, /c/workreform (lost builders!)
- /c/selfhosted, /c/privacy, /c/opensource
supports authenticated access for private instances and DM delivery.
"""
import requests
import json
import time
import os
from datetime import datetime
from pathlib import Path
from .signals import analyze_text
from .lost import (
analyze_social_for_lost_signals,
analyze_text_for_lost_signals,
classify_user,
)
# auth config from environment
LEMMY_INSTANCE = os.environ.get('LEMMY_INSTANCE', '')
LEMMY_USERNAME = os.environ.get('LEMMY_USERNAME', '')
LEMMY_PASSWORD = os.environ.get('LEMMY_PASSWORD', '')
# auth token cache
_auth_token = None
# popular lemmy instances
LEMMY_INSTANCES = [
'lemmy.ml',
'lemmy.world',
'programming.dev',
'lemm.ee',
'sh.itjust.works',
]
# communities to scout (format: community@instance or just community for local)
TARGET_COMMUNITIES = [
# builder communities
'programming',
'selfhosted',
'linux',
'opensource',
'privacy',
'technology',
'webdev',
'rust',
'python',
'golang',
# lost builder communities (people struggling, stuck, seeking)
'antiwork',
'workreform',
'careerguidance',
'cscareerquestions',
'learnprogramming',
'findapath',
]
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'lemmy'
CACHE_DIR.mkdir(parents=True, exist_ok=True)
def get_auth_token(instance=None):
    """get auth token for a lemmy instance"""
    global _auth_token
    # note: the token cache is global, so only one instance per process
    if _auth_token:
        return _auth_token
instance = instance or LEMMY_INSTANCE
if not all([instance, LEMMY_USERNAME, LEMMY_PASSWORD]):
return None
try:
url = f"https://{instance}/api/v3/user/login"
resp = requests.post(url, json={
'username_or_email': LEMMY_USERNAME,
'password': LEMMY_PASSWORD,
}, timeout=30)
if resp.status_code == 200:
_auth_token = resp.json().get('jwt')
return _auth_token
return None
except Exception as e:
print(f"lemmy auth error: {e}")
return None
def send_lemmy_dm(recipient_username, message, dry_run=False):
"""send a private message via lemmy"""
if not LEMMY_INSTANCE:
return False, "LEMMY_INSTANCE not configured"
if dry_run:
print(f"[dry run] would send lemmy DM to {recipient_username}")
return True, None
token = get_auth_token()
if not token:
return False, "failed to authenticate with lemmy"
try:
# parse recipient - could be username@instance or just username
if '@' in recipient_username:
username, instance = recipient_username.split('@', 1)
else:
username = recipient_username
instance = LEMMY_INSTANCE
# get recipient user id
user_url = f"https://{LEMMY_INSTANCE}/api/v3/user"
resp = requests.get(user_url, params={'username': f"{username}@{instance}"}, timeout=30)
if resp.status_code != 200:
# try without instance suffix for local users
resp = requests.get(user_url, params={'username': username}, timeout=30)
if resp.status_code != 200:
return False, f"could not find user {recipient_username}"
recipient_id = resp.json().get('person_view', {}).get('person', {}).get('id')
if not recipient_id:
return False, "could not get recipient id"
# send DM
dm_url = f"https://{LEMMY_INSTANCE}/api/v3/private_message"
resp = requests.post(dm_url,
headers={'Authorization': f'Bearer {token}'},
json={
'content': message,
'recipient_id': recipient_id,
},
timeout=30
)
if resp.status_code == 200:
return True, None
else:
return False, f"lemmy DM error: {resp.status_code} - {resp.text}"
except Exception as e:
return False, f"lemmy DM error: {str(e)}"
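the recipient parsing in send_lemmy_dm is worth seeing in isolation. `parse_recipient` is a helper name invented for this sketch:

```python
def parse_recipient(recipient, default_instance):
    # 'alice@lemmy.ml' -> ('alice', 'lemmy.ml'); a bare name falls back
    # to the configured default instance, as in send_lemmy_dm
    if '@' in recipient:
        name, instance = recipient.split('@', 1)
        return name, instance
    return recipient, default_instance
```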
def get_community_posts(instance, community, limit=50, sort='New'):
"""get posts from a lemmy community"""
try:
url = f"https://{instance}/api/v3/post/list"
params = {
'community_name': community,
'sort': sort,
'limit': limit,
}
resp = requests.get(url, params=params, timeout=30)
if resp.status_code == 200:
return resp.json().get('posts', [])
return []
except Exception:
return []
def get_user_profile(instance, username):
"""get lemmy user profile"""
try:
url = f"https://{instance}/api/v3/user"
params = {'username': username}
resp = requests.get(url, params=params, timeout=30)
if resp.status_code == 200:
return resp.json()
return None
except Exception:
return None
def analyze_lemmy_user(instance, username, posts=None):
"""analyze a lemmy user for values alignment and lost signals"""
profile = get_user_profile(instance, username)
if not profile:
return None
person = profile.get('person_view', {}).get('person', {})
counts = profile.get('person_view', {}).get('counts', {})
bio = person.get('bio', '') or ''
display_name = person.get('display_name') or person.get('name', username)
# analyze bio
bio_score, bio_signals, bio_reasons = analyze_text(bio)
# analyze posts if provided
post_signals = []
post_text = []
if posts:
for post in posts[:10]:
post_data = post.get('post', {})
title = post_data.get('name', '')
body = post_data.get('body', '')
post_text.append(f"{title} {body}")
_, signals, _ = analyze_text(f"{title} {body}")
post_signals.extend(signals)
all_signals = list(set(bio_signals + post_signals))
total_score = bio_score + len(post_signals) * 5
# lost builder detection
profile_for_lost = {
'bio': bio,
'post_count': counts.get('post_count', 0),
'comment_count': counts.get('comment_count', 0),
}
posts_for_lost = [{'text': t} for t in post_text]
lost_signals, lost_weight = analyze_social_for_lost_signals(profile_for_lost, posts_for_lost)
lost_potential_score = lost_weight
user_type = classify_user(lost_potential_score, 50, total_score)
return {
'platform': 'lemmy',
'username': f"{username}@{instance}",
'url': f"https://{instance}/u/{username}",
'name': display_name,
'bio': bio,
'location': None,
'score': total_score,
'confidence': min(0.9, 0.3 + len(all_signals) * 0.1),
'signals': all_signals,
'negative_signals': [],
'reasons': bio_reasons,
'contact': {},
'extra': {
'instance': instance,
'post_count': counts.get('post_count', 0),
'comment_count': counts.get('comment_count', 0),
},
'lost_potential_score': lost_potential_score,
'lost_signals': lost_signals,
'user_type': user_type,
}
def scrape_lemmy(db, limit_per_community=30):
"""scrape lemmy instances for aligned builders"""
print("scouting lemmy...")
found = 0
lost_found = 0
seen_users = set()
# build instance list - user's instance first if configured
instances = list(LEMMY_INSTANCES)
if LEMMY_INSTANCE and LEMMY_INSTANCE not in instances:
instances.insert(0, LEMMY_INSTANCE)
for instance in instances:
print(f" instance: {instance}")
for community in TARGET_COMMUNITIES:
posts = get_community_posts(instance, community, limit=limit_per_community)
if not posts:
continue
print(f" /c/{community}: {len(posts)} posts")
# group posts by user
user_posts = {}
for post in posts:
creator = post.get('creator', {})
username = creator.get('name')
if not username:
continue
user_key = f"{username}@{instance}"
if user_key in seen_users:
continue
if user_key not in user_posts:
user_posts[user_key] = []
user_posts[user_key].append(post)
# analyze each user
for user_key, user_post_list in user_posts.items():
username = user_key.split('@')[0]
seen_users.add(user_key)
result = analyze_lemmy_user(instance, username, user_post_list)
if not result:
continue
if result['score'] >= 20 or result.get('lost_potential_score', 0) >= 30:
db.save_human(result)
found += 1
if result.get('user_type') in ['lost', 'both']:
lost_found += 1
print(f" {result['username']}: {result['score']:.0f} (lost: {result['lost_potential_score']:.0f})")
elif result['score'] >= 40:
print(f" {result['username']}: {result['score']:.0f}")
time.sleep(0.5) # rate limit
time.sleep(1) # between communities
time.sleep(2) # between instances
print(f"lemmy: found {found} humans ({lost_found} lost builders)")
return found


@@ -1,169 +0,0 @@
"""
scoutd/lobsters.py - lobste.rs discovery
high-signal invite-only tech community
"""
import requests
import json
import time
from datetime import datetime
from pathlib import Path
from .signals import analyze_text
HEADERS = {'User-Agent': 'connectd/1.0', 'Accept': 'application/json'}
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'lobsters'
ALIGNED_TAGS = ['privacy', 'security', 'distributed', 'rust', 'linux', 'culture', 'practices']
def _api_get(url, params=None):
"""rate-limited request"""
cache_key = f"{url}_{json.dumps(params or {}, sort_keys=True)}"
# hashlib digest is stable across runs; builtin hash() is salted per process,
# so the old cache files could never be hit after a restart
import hashlib
cache_file = CACHE_DIR / f"{hashlib.md5(cache_key.encode()).hexdigest()[:16]}.json"
CACHE_DIR.mkdir(parents=True, exist_ok=True)
if cache_file.exists():
try:
data = json.loads(cache_file.read_text())
if time.time() - data.get('_cached_at', 0) < 3600:
return data.get('_data')
except (OSError, ValueError):
pass
time.sleep(2)
try:
resp = requests.get(url, headers=HEADERS, params=params, timeout=30)
resp.raise_for_status()
result = resp.json()
cache_file.write_text(json.dumps({'_cached_at': time.time(), '_data': result}))
return result
except requests.exceptions.RequestException as e:
print(f" lobsters api error: {e}")
return None
def get_stories_by_tag(tag):
"""get recent stories by tag"""
url = f'https://lobste.rs/t/{tag}.json'
return _api_get(url) or []
def get_newest_stories():
"""get newest stories"""
return _api_get('https://lobste.rs/newest.json') or []
def get_user(username):
"""get user profile"""
return _api_get(f'https://lobste.rs/u/{username}.json')
def analyze_lobsters_user(username):
"""analyze a lobste.rs user"""
user = get_user(username)
if not user:
return None
text_parts = []
if user.get('about'):
text_parts.append(user['about'])
full_text = ' '.join(text_parts)
text_score, positive_signals, negative_signals = analyze_text(full_text)
# lobsters base bonus (invite-only, high signal)
base_score = 15
# karma bonus
karma = user.get('karma', 0)
karma_score = 0
if karma > 100:
karma_score = 10
elif karma > 50:
karma_score = 5
# github presence
github_score = 5 if user.get('github_username') else 0
# homepage
homepage_score = 5 if user.get('homepage') else 0
total_score = text_score + base_score + karma_score + github_score + homepage_score
# confidence
confidence = 0.4 # higher base for invite-only
if text_parts:
confidence += 0.2
if karma > 50:
confidence += 0.2
confidence = min(confidence, 0.9)
reasons = ['on lobste.rs (invite-only)']
if karma > 50:
reasons.append(f"active ({karma} karma)")
if positive_signals:
reasons.append(f"signals: {', '.join(positive_signals[:5])}")
if negative_signals:
reasons.append(f"WARNING: {', '.join(negative_signals)}")
return {
'platform': 'lobsters',
'username': username,
'url': f"https://lobste.rs/u/{username}",
'score': total_score,
'confidence': confidence,
'signals': positive_signals,
'negative_signals': negative_signals,
'karma': karma,
'reasons': reasons,
'contact': {
'github': user.get('github_username'),
'twitter': user.get('twitter_username'),
'homepage': user.get('homepage'),
},
'scraped_at': datetime.now().isoformat(),
}
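the non-text part of the score above composes from fixed bonuses. `lobsters_bonus` is a name invented for this sketch; weights are copied from analyze_lobsters_user:

```python
def lobsters_bonus(karma, has_github, has_homepage):
    score = 15  # base bonus for being on an invite-only site
    if karma > 100:
        score += 10  # karma tiers, same thresholds as above
    elif karma > 50:
        score += 5
    if has_github:
        score += 5
    if has_homepage:
        score += 5
    return score
```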
def scrape_lobsters(db):
"""full lobste.rs scrape"""
print("scoutd/lobsters: starting scrape...")
all_users = set()
# stories by aligned tags
for tag in ALIGNED_TAGS:
print(f" tag: {tag}...")
stories = get_stories_by_tag(tag)
for story in stories:
submitter = story.get('submitter_user', {}).get('username')
if submitter:
all_users.add(submitter)
# newest stories
print(" newest stories...")
for story in get_newest_stories():
submitter = story.get('submitter_user', {}).get('username')
if submitter:
all_users.add(submitter)
print(f" {len(all_users)} unique users to analyze")
# analyze
results = []
for username in all_users:
try:
result = analyze_lobsters_user(username)
if result and result['score'] > 0:
results.append(result)
db.save_human(result)
if result['score'] >= 30:
print(f"{username}: {result['score']} pts")
except Exception as e:
print(f" error on {username}: {e}")
print(f"scoutd/lobsters: found {len(results)} aligned humans")
return results


@@ -1,491 +0,0 @@
"""
scoutd/lost.py - lost builder detection
finds people with potential who haven't found it yet, gave up, or are too beaten down to try.
these aren't failures. they're seeds that never got water.
detection signals:
- github: forked but never modified, starred many but built nothing, learning repos abandoned
- reddit/forums: "i wish i could...", stuck asking beginner questions for years, helping others but never sharing
- social: retoots builders but never posts own work, imposter syndrome language, isolation signals
- profiles: bio says what they WANT to be, "aspiring" for 2+ years, empty portfolios
the goal isn't to recruit them. it's to show them the door exists.
"""
import re
from datetime import datetime, timedelta
from collections import defaultdict
# signal definitions with weights
LOST_SIGNALS = {
# github signals
'forked_never_modified': {
'weight': 15,
'category': 'github',
'description': 'forked repos but never pushed changes',
},
'starred_many_built_nothing': {
'weight': 20,
'category': 'github',
'description': 'starred 50+ repos but has 0-2 own repos',
},
'account_no_repos': {
'weight': 10,
'category': 'github',
'description': 'account exists but no public repos',
},
'inactivity_bursts': {
'weight': 15,
'category': 'github',
'description': 'long gaps then brief activity bursts',
},
'only_issues_comments': {
'weight': 12,
'category': 'github',
'description': 'only activity is issues/comments on others work',
},
'abandoned_learning_repos': {
'weight': 18,
'category': 'github',
'description': 'learning/tutorial repos that were never finished',
},
'readme_only_repos': {
'weight': 10,
'category': 'github',
'description': 'repos with just README, no actual code',
},
# language signals (from posts/comments/bio)
'wish_i_could': {
'weight': 12,
'category': 'language',
'description': '"i wish i could..." language',
'patterns': [
r'i wish i could',
r'i wish i knew how',
r'wish i had the (time|energy|motivation|skills?)',
],
},
'someday_want': {
'weight': 10,
'category': 'language',
'description': '"someday i want to..." language',
'patterns': [
r'someday i (want|hope|plan) to',
r'one day i\'ll',
r'eventually i\'ll',
r'when i have time i\'ll',
],
},
'stuck_beginner': {
'weight': 20,
'category': 'language',
'description': 'asking beginner questions for years',
'patterns': [
r'still (trying|learning|struggling) (to|with)',
r'can\'t seem to (get|understand|figure)',
r'been trying for (months|years)',
],
},
'self_deprecating': {
'weight': 15,
'category': 'language',
'description': 'self-deprecating about abilities',
'patterns': [
r'i\'m (not smart|too dumb|not good) enough',
r'i (suck|am terrible) at',
r'i\'ll never be able to',
r'people like me (can\'t|don\'t)',
r'i\'m just not (a|the) (type|kind)',
],
},
'no_energy': {
'weight': 18,
'category': 'language',
'description': '"how do people have energy" posts',
'patterns': [
r'how do (people|you|they) have (the )?(energy|time|motivation)',
r'where do (people|you|they) find (the )?(energy|motivation)',
r'i\'m (always|constantly) (tired|exhausted|drained)',
r'no (energy|motivation) (left|anymore)',
],
},
'imposter_syndrome': {
'weight': 15,
'category': 'language',
'description': 'imposter syndrome language',
'patterns': [
r'imposter syndrome',
r'feel like (a |an )?(fraud|fake|imposter)',
r'don\'t (belong|deserve)',
r'everyone else (seems|is) (so much )?(better|smarter)',
r'they\'ll (find out|realize) i\'m',
],
},
'should_really': {
'weight': 8,
'category': 'language',
'description': '"i should really..." posts',
'patterns': [
r'i (should|need to) really',
r'i keep (meaning|wanting) to',
r'i\'ve been (meaning|wanting) to',
],
},
'isolation_signals': {
'weight': 20,
'category': 'language',
'description': 'isolation/loneliness language',
'patterns': [
r'no one (understands|gets it|to talk to)',
r'(feel|feeling) (so )?(alone|isolated|lonely)',
r'don\'t have anyone (to|who)',
r'wish i (had|knew) (someone|people)',
],
},
'enthusiasm_for_others': {
'weight': 10,
'category': 'behavior',
'description': 'celebrates others but dismissive of self',
},
# subreddit/community signals
'stuck_communities': {
'weight': 15,
'category': 'community',
'description': 'active in stuck/struggling communities',
'subreddits': [
'learnprogramming',
'findapath',
'getdisciplined',
'getmotivated',
'decidingtobebetter',
'selfimprovement',
'adhd',
'depression',
'anxiety',
],
},
# profile signals
'aspirational_bio': {
'weight': 12,
'category': 'profile',
'description': 'bio says what they WANT to be',
'patterns': [
r'aspiring',
r'future',
r'want(ing)? to (be|become)',
r'learning to',
r'trying to (become|be|learn)',
r'hoping to',
],
},
'empty_portfolio': {
'weight': 15,
'category': 'profile',
'description': 'links to empty portfolio sites',
},
'long_aspiring': {
'weight': 20,
'category': 'profile',
'description': '"aspiring" in bio for 2+ years',
},
}
# subreddits that indicate someone might be stuck
STUCK_SUBREDDITS = {
'learnprogramming': 8,
'findapath': 15,
'getdisciplined': 12,
'getmotivated': 10,
'decidingtobebetter': 12,
'selfimprovement': 8,
'adhd': 10,
'depression': 15,
'anxiety': 12,
'socialanxiety': 12,
'neet': 20,
'lostgeneration': 15,
'antiwork': 5, # could be aligned OR stuck
'careerguidance': 8,
'cscareerquestions': 5,
}
def analyze_text_for_lost_signals(text):
"""analyze text for lost builder language patterns"""
if not text:
return [], 0
text_lower = text.lower()
signals_found = []
total_weight = 0
for signal_name, signal_data in LOST_SIGNALS.items():
if 'patterns' not in signal_data:
continue
for pattern in signal_data['patterns']:
if re.search(pattern, text_lower):
signals_found.append(signal_name)
total_weight += signal_data['weight']
break # only count each signal once
return signals_found, total_weight
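the per-signal break semantics above (each signal counts at most once, however many of its patterns match) can be exercised standalone. DEMO_SIGNALS is a trimmed copy of two entries from LOST_SIGNALS:

```python
import re

DEMO_SIGNALS = {
    'wish_i_could': {'weight': 12, 'patterns': [r'i wish i could']},
    'someday_want': {'weight': 10, 'patterns': [r'someday i (want|hope|plan) to',
                                                r'one day i\'ll']},
}

def scan(text):
    found, weight = [], 0
    for name, data in DEMO_SIGNALS.items():
        for pattern in data['patterns']:
            if re.search(pattern, text.lower()):
                found.append(name)
                weight += data['weight']
                break  # each signal counts at most once
    return found, weight
```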
def analyze_github_for_lost_signals(profile):
"""analyze github profile for lost builder signals"""
signals_found = []
total_weight = 0
if not profile:
return signals_found, total_weight
repos = profile.get('repos', []) or profile.get('top_repos', [])
extra = profile.get('extra', {})
public_repos = profile.get('public_repos', len(repos))
followers = profile.get('followers', 0)
following = profile.get('following', 0)
# starred many but built nothing
# (we'd need to fetch starred count separately, approximate with following ratio)
if public_repos <= 2 and following > 50:
signals_found.append('starred_many_built_nothing')
total_weight += LOST_SIGNALS['starred_many_built_nothing']['weight']
# account but no repos
if public_repos == 0:
signals_found.append('account_no_repos')
total_weight += LOST_SIGNALS['account_no_repos']['weight']
# check repos for signals
forked_count = 0
forked_modified = 0
learning_repos = 0
readme_only = 0
learning_keywords = ['learning', 'tutorial', 'course', 'practice', 'exercise',
'bootcamp', 'udemy', 'freecodecamp', 'odin', 'codecademy']
for repo in repos:
name = (repo.get('name') or '').lower()
description = (repo.get('description') or '').lower()
language = repo.get('language')
is_fork = repo.get('fork', False)
# forked but never modified
if is_fork:
forked_count += 1
# if pushed_at is close to created_at, never modified
# (simplified: just count forks for now)
# learning/tutorial repos
if any(kw in name or kw in description for kw in learning_keywords):
learning_repos += 1
# readme only (no language detected usually means no code)
if not language and not is_fork:
readme_only += 1
if forked_count >= 5 and public_repos - forked_count <= 2:
signals_found.append('forked_never_modified')
total_weight += LOST_SIGNALS['forked_never_modified']['weight']
if learning_repos >= 3:
signals_found.append('abandoned_learning_repos')
total_weight += LOST_SIGNALS['abandoned_learning_repos']['weight']
if readme_only >= 2:
signals_found.append('readme_only_repos')
total_weight += LOST_SIGNALS['readme_only_repos']['weight']
# check bio for lost signals
bio = profile.get('bio') or ''
bio_signals, bio_weight = analyze_text_for_lost_signals(bio)
signals_found.extend(bio_signals)
total_weight += bio_weight
# aspirational bio check
bio_lower = bio.lower()
if any(re.search(p, bio_lower) for p in LOST_SIGNALS['aspirational_bio']['patterns']):
if 'aspirational_bio' not in signals_found:
signals_found.append('aspirational_bio')
total_weight += LOST_SIGNALS['aspirational_bio']['weight']
return signals_found, total_weight
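the numeric heuristics above reduce to a few threshold checks. `github_lost_flags` is a name invented for this sketch; thresholds are copied from analyze_github_for_lost_signals:

```python
def github_lost_flags(public_repos, following, forked_count):
    flags = []
    # high following but almost no own repos approximates 'starred many'
    if public_repos <= 2 and following > 50:
        flags.append('starred_many_built_nothing')
    if public_repos == 0:
        flags.append('account_no_repos')
    # many forks, at most two original repos
    if forked_count >= 5 and public_repos - forked_count <= 2:
        flags.append('forked_never_modified')
    return flags
```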
def analyze_reddit_for_lost_signals(activity, subreddits):
"""analyze reddit activity for lost builder signals"""
signals_found = []
total_weight = 0
# check subreddit activity
stuck_sub_activity = 0
for sub in subreddits:
if sub.lower() in STUCK_SUBREDDITS:
stuck_sub_activity += STUCK_SUBREDDITS[sub.lower()]
if stuck_sub_activity >= 20:
signals_found.append('stuck_communities')
total_weight += min(stuck_sub_activity, 30) # cap at 30
# analyze post/comment text
all_text = []
for item in activity:
if item.get('title'):
all_text.append(item['title'])
if item.get('body'):
all_text.append(item['body'])
combined_text = ' '.join(all_text)
text_signals, text_weight = analyze_text_for_lost_signals(combined_text)
signals_found.extend(text_signals)
total_weight += text_weight
# check for helping others but never sharing own work
help_count = 0
share_count = 0
for item in activity:
body = (item.get('body') or '').lower()
title = (item.get('title') or '').lower()
# helping patterns
if any(p in body for p in ['try this', 'you could', 'have you tried', 'i recommend']):
help_count += 1
# sharing patterns
if any(p in body + title for p in ['i built', 'i made', 'my project', 'check out my', 'i created']):
share_count += 1
if help_count >= 5 and share_count == 0:
signals_found.append('enthusiasm_for_others')
total_weight += LOST_SIGNALS['enthusiasm_for_others']['weight']
return signals_found, total_weight
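the community weighting above fires only once combined weight reaches 20 and is capped at 30. DEMO_STUCK is a trimmed copy of STUCK_SUBREDDITS; the helper name is invented for this sketch:

```python
DEMO_STUCK = {'learnprogramming': 8, 'findapath': 15, 'getmotivated': 10}

def stuck_community_weight(subreddits):
    # mirrors the threshold-and-cap logic in analyze_reddit_for_lost_signals
    total = sum(DEMO_STUCK.get(s.lower(), 0) for s in subreddits)
    return min(total, 30) if total >= 20 else 0
```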
def analyze_social_for_lost_signals(profile, posts):
"""analyze mastodon/social for lost builder signals"""
signals_found = []
total_weight = 0
# check bio
bio = profile.get('bio') or profile.get('note') or ''
bio_signals, bio_weight = analyze_text_for_lost_signals(bio)
signals_found.extend(bio_signals)
total_weight += bio_weight
# check posts
boost_count = 0
original_count = 0
own_work_count = 0
for post in posts:
content = (post.get('content') or '').lower()
is_boost = post.get('reblog') is not None or post.get('repost')
if is_boost:
boost_count += 1
else:
original_count += 1
# check if sharing own work
if any(p in content for p in ['i built', 'i made', 'my project', 'working on', 'just shipped']):
own_work_count += 1
# analyze text
text_signals, text_weight = analyze_text_for_lost_signals(content)
for sig in text_signals:
if sig not in signals_found:
signals_found.append(sig)
total_weight += LOST_SIGNALS[sig]['weight']
# boosts builders but never posts own work
if boost_count >= 10 and own_work_count == 0:
signals_found.append('enthusiasm_for_others')
total_weight += LOST_SIGNALS['enthusiasm_for_others']['weight']
return signals_found, total_weight
def calculate_lost_potential_score(signals_found):
"""calculate overall lost potential score from signals"""
total = 0
for signal in signals_found:
if signal in LOST_SIGNALS:
total += LOST_SIGNALS[signal]['weight']
return total
def classify_user(lost_score, builder_score, values_score):
"""
classify user as builder, lost, or neither
returns: 'builder' | 'lost' | 'both' | 'none'
"""
# high builder score = active builder
if builder_score >= 50 and lost_score < 30:
return 'builder'
# high lost score + values alignment = lost builder (priority outreach)
if lost_score >= 40 and values_score >= 20:
return 'lost'
# both signals = complex case, might be recovering
if lost_score >= 30 and builder_score >= 30:
return 'both'
return 'none'
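the decision boundaries above, exercised on a few hypothetical score triples (logic copied verbatim):

```python
def classify_user(lost_score, builder_score, values_score):
    if builder_score >= 50 and lost_score < 30:
        return 'builder'
    if lost_score >= 40 and values_score >= 20:
        return 'lost'
    if lost_score >= 30 and builder_score >= 30:
        return 'both'
    return 'none'

examples = [
    ((10, 60, 50), 'builder'),  # shipping, not stuck
    ((45, 10, 25), 'lost'),     # stuck but values-aligned
    ((35, 35, 10), 'both'),     # recovering builder
    ((45, 10, 5), 'none'),      # stuck, no alignment signal
]
```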
def get_signal_descriptions(signals_found):
"""get human-readable descriptions of detected signals"""
descriptions = []
for signal in signals_found:
if signal in LOST_SIGNALS:
descriptions.append(LOST_SIGNALS[signal]['description'])
return descriptions
def should_outreach_lost(user_data, config=None):
"""
determine if we should reach out to a lost builder
considers:
- lost_potential_score threshold
- values alignment
- cooldown period
- manual review requirement
"""
config = config or {}
lost_score = user_data.get('lost_potential_score', 0)
values_score = user_data.get('score', 0) # regular alignment score
# minimum thresholds
min_lost = config.get('min_lost_score', 40)
min_values = config.get('min_values_score', 20)
if lost_score < min_lost:
return False, 'lost_score too low'
if values_score < min_values:
return False, 'values_score too low'
# check cooldown
last_outreach = user_data.get('last_lost_outreach')
if last_outreach:
cooldown_days = config.get('cooldown_days', 90)
last_dt = datetime.fromisoformat(last_outreach)
if datetime.now() - last_dt < timedelta(days=cooldown_days):
return False, f'cooldown active ({cooldown_days} days)'
# always require manual review for lost outreach
return True, 'requires_review'
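the cooldown check above can be sketched standalone; `cooldown_active` is a helper name invented here:

```python
from datetime import datetime, timedelta

def cooldown_active(last_outreach_iso, cooldown_days=90):
    # mirrors the check in should_outreach_lost
    last = datetime.fromisoformat(last_outreach_iso)
    return datetime.now() - last < timedelta(days=cooldown_days)

recent = (datetime.now() - timedelta(days=30)).isoformat()
stale = (datetime.now() - timedelta(days=120)).isoformat()
```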


@@ -1,290 +0,0 @@
"""
scoutd/mastodon.py - fediverse discovery
scrapes high-signal instances: tech.lgbt, social.coop, fosstodon, hackers.town
also detects lost builders - social isolation, imposter syndrome, struggling folks
"""
import requests
import json
import time
import re
from datetime import datetime
from pathlib import Path
from .signals import analyze_text, ALIGNED_INSTANCES
from .lost import (
analyze_social_for_lost_signals,
analyze_text_for_lost_signals,
classify_user,
get_signal_descriptions,
)
HEADERS = {'User-Agent': 'connectd/1.0', 'Accept': 'application/json'}
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'mastodon'
TARGET_HASHTAGS = [
'selfhosted', 'homelab', 'homeassistant', 'foss', 'opensource',
'privacy', 'solarpunk', 'cooperative', 'cohousing', 'mutualaid',
'intentionalcommunity', 'degoogle', 'fediverse', 'indieweb',
]
def _api_get(url, params=None):
"""rate-limited request"""
cache_key = f"{url}_{json.dumps(params or {}, sort_keys=True)}"
# hashlib digest is stable across runs; builtin hash() is salted per process
import hashlib
cache_file = CACHE_DIR / f"{hashlib.md5(cache_key.encode()).hexdigest()[:16]}.json"
CACHE_DIR.mkdir(parents=True, exist_ok=True)
if cache_file.exists():
try:
data = json.loads(cache_file.read_text())
if time.time() - data.get('_cached_at', 0) < 3600:
return data.get('_data')
except (OSError, ValueError):
pass
time.sleep(1)
try:
resp = requests.get(url, headers=HEADERS, params=params, timeout=30)
resp.raise_for_status()
result = resp.json()
cache_file.write_text(json.dumps({'_cached_at': time.time(), '_data': result}))
return result
except requests.exceptions.RequestException as e:
print(f" mastodon api error: {e}")
return None
def strip_html(text):
"""strip html tags"""
return re.sub(r'<[^>]+>', ' ', text) if text else ''
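the regex strip above works for plain status HTML; a real parser is safer for nested or malformed markup, but for a quick sketch:

```python
import re

def strip_html(text):
    # same one-liner as above: replace every tag with a space
    return re.sub(r'<[^>]+>', ' ', text) if text else ''
```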
def get_instance_directory(instance, limit=40):
"""get users from instance directory"""
url = f'https://{instance}/api/v1/directory'
return _api_get(url, {'limit': limit, 'local': 'true'}) or []
def get_hashtag_timeline(instance, hashtag, limit=40):
"""get posts from hashtag"""
url = f'https://{instance}/api/v1/timelines/tag/{hashtag}'
return _api_get(url, {'limit': limit}) or []
def get_user_statuses(instance, user_id, limit=30):
"""get user's recent posts"""
url = f'https://{instance}/api/v1/accounts/{user_id}/statuses'
return _api_get(url, {'limit': limit, 'exclude_reblogs': 'true'}) or []
def analyze_mastodon_user(account, instance):
"""analyze a mastodon account"""
acct = account.get('acct', '')
if '@' not in acct:
acct = f"{acct}@{instance}"
# collect text
text_parts = []
bio = strip_html(account.get('note', ''))
if bio:
text_parts.append(bio)
display_name = account.get('display_name', '')
if display_name:
text_parts.append(display_name)
# profile fields
for field in account.get('fields', []):
if field.get('name'):
text_parts.append(field['name'])
if field.get('value'):
text_parts.append(strip_html(field['value']))
# get recent posts
user_id = account.get('id')
if user_id:
statuses = get_user_statuses(instance, user_id)
for status in statuses:
content = strip_html(status.get('content', ''))
if content:
text_parts.append(content)
full_text = ' '.join(text_parts)
text_score, positive_signals, negative_signals = analyze_text(full_text)
# instance bonus
instance_bonus = ALIGNED_INSTANCES.get(instance, 0)
total_score = text_score + instance_bonus
# pronouns bonus
if re.search(r'\b(they/them|she/her|he/him|xe/xem)\b', full_text, re.I):
total_score += 10
positive_signals.append('pronouns')
# activity level
statuses_count = account.get('statuses_count', 0)
followers = account.get('followers_count', 0)
if statuses_count > 100:
total_score += 5
# === LOST BUILDER DETECTION ===
# build profile and posts for lost analysis
profile_for_lost = {
'bio': bio,
'note': account.get('note'),
}
# convert statuses to posts format for analyze_social_for_lost_signals
posts_for_lost = []
if user_id:
# reuse the statuses already fetched above (no second api call)
for status in statuses:
posts_for_lost.append({
'content': strip_html(status.get('content', '')),
'reblog': status.get('reblog'),
})
# analyze for lost signals
lost_signals, lost_weight = analyze_social_for_lost_signals(profile_for_lost, posts_for_lost)
# also check combined text for lost patterns
text_lost_signals, text_lost_weight = analyze_text_for_lost_signals(full_text)
for sig in text_lost_signals:
if sig not in lost_signals:
lost_signals.append(sig)
lost_weight += text_lost_weight
lost_potential_score = lost_weight
# classify: builder, lost, both, or none
# for mastodon, statuses_count is a rough proxy for builder activity;
# it tops out at 10, so the 'builder' branch (>=50) effectively never fires here
builder_activity = 10 if statuses_count > 100 else 5 if statuses_count > 50 else 0
user_type = classify_user(lost_potential_score, builder_activity, total_score)
# confidence
confidence = 0.3
if len(text_parts) > 5:
confidence += 0.2
if statuses_count > 50:
confidence += 0.2
if len(positive_signals) > 3:
confidence += 0.2
confidence = min(confidence, 0.9)
reasons = []
if instance in ALIGNED_INSTANCES:
reasons.append(f"on {instance}")
if positive_signals:
reasons.append(f"signals: {', '.join(positive_signals[:5])}")
if negative_signals:
reasons.append(f"WARNING: {', '.join(negative_signals)}")
# add lost reasons if applicable
if user_type == 'lost' or user_type == 'both':
lost_descriptions = get_signal_descriptions(lost_signals)
if lost_descriptions:
reasons.append(f"LOST SIGNALS: {', '.join(lost_descriptions[:3])}")
return {
'platform': 'mastodon',
'username': acct,
'url': account.get('url'),
'name': display_name,
'bio': bio,
'instance': instance,
'score': total_score,
'confidence': confidence,
'signals': positive_signals,
'negative_signals': negative_signals,
'statuses_count': statuses_count,
'followers': followers,
'reasons': reasons,
'scraped_at': datetime.now().isoformat(),
# lost builder fields
'lost_potential_score': lost_potential_score,
'lost_signals': lost_signals,
'user_type': user_type,
}
def scrape_mastodon(db, limit_per_instance=40):
"""full mastodon scrape"""
print("scoutd/mastodon: starting scrape...")
all_accounts = []
# 1. instance directories
print(" scraping instance directories...")
for instance in ALIGNED_INSTANCES:
accounts = get_instance_directory(instance, limit=limit_per_instance)
for acct in accounts:
acct['_instance'] = instance
all_accounts.append(acct)
print(f" {instance}: {len(accounts)} users")
# 2. hashtag timelines
print(" scraping hashtags...")
seen = set()
for tag in TARGET_HASHTAGS[:8]:
for instance in ['fosstodon.org', 'tech.lgbt', 'social.coop']:
posts = get_hashtag_timeline(instance, tag, limit=20)
for post in posts:
account = post.get('account', {})
acct = account.get('acct', '')
if '@' not in acct:
acct = f"{acct}@{instance}"
if acct not in seen:
seen.add(acct)
account['_instance'] = instance
all_accounts.append(account)
# dedupe
unique = {}
for acct in all_accounts:
key = acct.get('acct', acct.get('id', ''))
if key not in unique:
unique[key] = acct
print(f" {len(unique)} unique accounts to analyze")
# analyze
results = []
builders_found = 0
lost_found = 0
for acct_data in unique.values():
instance = acct_data.get('_instance', 'mastodon.social')
try:
result = analyze_mastodon_user(acct_data, instance)
if result and result['score'] > 0:
results.append(result)
db.save_human(result)
user_type = result.get('user_type', 'none')
if user_type == 'builder':
builders_found += 1
if result['score'] >= 40:
print(f" ★ @{result['username']}: {result['score']} pts")
elif user_type == 'lost':
lost_found += 1
lost_score = result.get('lost_potential_score', 0)
if lost_score >= 40:
print(f" 💔 @{result['username']}: lost_score={lost_score}, values={result['score']} pts")
elif user_type == 'both':
builders_found += 1
lost_found += 1
print(f" ⚡ @{result['username']}: recovering builder")
except Exception as e:
print(f" error: {e}")
print(f"scoutd/mastodon: found {len(results)} aligned humans")
print(f" - {builders_found} active builders")
print(f" - {lost_found} lost builders (need encouragement)")
return results


@@ -1,196 +0,0 @@
"""
scoutd/matrix.py - matrix room membership discovery
finds users in multiple aligned public rooms
"""
import requests
import json
import time
from datetime import datetime
from pathlib import Path
from collections import defaultdict
from .signals import analyze_text
HEADERS = {'User-Agent': 'connectd/1.0', 'Accept': 'application/json'}
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'matrix'
# public matrix rooms to check membership
ALIGNED_ROOMS = [
'#homeassistant:matrix.org',
'#esphome:matrix.org',
'#selfhosted:matrix.org',
'#privacy:matrix.org',
'#solarpunk:matrix.org',
'#cooperative:matrix.org',
'#foss:matrix.org',
'#linux:matrix.org',
]
# homeservers to query
HOMESERVERS = [
'matrix.org',
'matrix.envs.net',
'tchncs.de',
]
def _api_get(url, params=None):
"""rate-limited request"""
cache_key = f"{url}_{json.dumps(params or {}, sort_keys=True)}"
# hashlib digest is stable across runs; builtin hash() is salted per process
import hashlib
cache_file = CACHE_DIR / f"{hashlib.md5(cache_key.encode()).hexdigest()[:16]}.json"
CACHE_DIR.mkdir(parents=True, exist_ok=True)
if cache_file.exists():
try:
data = json.loads(cache_file.read_text())
if time.time() - data.get('_cached_at', 0) < 3600:
return data.get('_data')
except (OSError, ValueError):
pass
time.sleep(1)
try:
resp = requests.get(url, headers=HEADERS, params=params, timeout=30)
resp.raise_for_status()
result = resp.json()
cache_file.write_text(json.dumps({'_cached_at': time.time(), '_data': result}))
return result
except requests.exceptions.RequestException as e:
# matrix apis often fail, don't spam errors
return None
def get_room_members(homeserver, room_alias):
"""
get members of a public room
note: most matrix servers don't expose this publicly
this is a best-effort scrape
"""
# resolve room alias to id first
try:
# the leading '#' in an alias must be percent-encoded, otherwise it is
# parsed as a URL fragment and the request path gets truncated
from urllib.parse import quote
alias_url = f'https://{homeserver}/_matrix/client/r0/directory/room/{quote(room_alias, safe="")}'
alias_data = _api_get(alias_url)
if not alias_data or 'room_id' not in alias_data:
return []
room_id = alias_data['room_id']
# try to get members (usually requires auth)
members_url = f'https://{homeserver}/_matrix/client/r0/rooms/{room_id}/members'
members_data = _api_get(members_url)
if members_data and 'chunk' in members_data:
members = []
for event in members_data['chunk']:
if event.get('type') == 'm.room.member' and event.get('content', {}).get('membership') == 'join':
user_id = event.get('state_key')
display_name = event.get('content', {}).get('displayname')
if user_id:
members.append({'user_id': user_id, 'display_name': display_name})
return members
except Exception:
pass
return []
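one gotcha worth noting when building these URLs: a matrix room alias starts with `#`, and unencoded it is treated as a URL fragment, silently dropping everything after it from the request path. `urllib.parse.quote` with `safe=''` handles it:

```python
from urllib.parse import quote

alias = '#selfhosted:matrix.org'
encoded = quote(alias, safe='')  # '#' -> %23, ':' -> %3A
```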
def get_public_rooms(homeserver, limit=100):
"""get public rooms directory"""
url = f'https://{homeserver}/_matrix/client/r0/publicRooms'
data = _api_get(url, {'limit': limit})
return data.get('chunk', []) if data else []
def analyze_matrix_user(user_id, rooms_joined, display_name=None):
"""analyze a matrix user based on room membership"""
# score based on room membership overlap
room_score = len(rooms_joined) * 10
# multi-room bonus
if len(rooms_joined) >= 4:
room_score += 20
elif len(rooms_joined) >= 2:
room_score += 10
# analyze display name if available
text_score = 0
signals = []
if display_name:
text_score, signals, _ = analyze_text(display_name)
total_score = room_score + text_score
confidence = 0.3
if len(rooms_joined) >= 3:
confidence += 0.3
if display_name:
confidence += 0.1
confidence = min(confidence, 0.8)
reasons = [f"in {len(rooms_joined)} aligned rooms: {', '.join(rooms_joined[:3])}"]
if signals:
reasons.append(f"signals: {', '.join(signals[:3])}")
return {
'platform': 'matrix',
'username': user_id,
'url': f"https://matrix.to/#/{user_id}",
'name': display_name,
'score': total_score,
'confidence': confidence,
'signals': signals,
'rooms': rooms_joined,
'reasons': reasons,
'scraped_at': datetime.now().isoformat(),
}
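The confidence heuristic above is easy to check with a couple of worked values: base 0.3, plus 0.3 for membership in 3+ rooms, plus 0.1 for a display name, capped at 0.8. A standalone sketch mirroring that logic (hypothetical helper, not part of the module):

```python
def matrix_confidence(n_rooms: int, has_display_name: bool) -> float:
    """base 0.3, +0.3 for 3+ aligned rooms, +0.1 for a display name, cap 0.8"""
    confidence = 0.3
    if n_rooms >= 3:
        confidence += 0.3
    if has_display_name:
        confidence += 0.1
    return min(confidence, 0.8)
```

so a user in one room with no display name scores 0.3, and a named user in three rooms scores 0.7.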
def scrape_matrix(db):
"""
matrix scrape - limited due to auth requirements
best effort on public room data
"""
print("scoutd/matrix: starting scrape (limited - most apis require auth)...")
user_rooms = defaultdict(list)
# try to get public room directories
for homeserver in HOMESERVERS:
print(f" checking {homeserver} public rooms...")
rooms = get_public_rooms(homeserver, limit=50)
for room in rooms:
room_alias = room.get('canonical_alias', '')
# check if it matches any aligned room patterns
aligned_keywords = ['homeassistant', 'selfhosted', 'privacy', 'linux', 'foss', 'cooperative']
if any(kw in room_alias.lower() or kw in room.get('name', '').lower() for kw in aligned_keywords):
print(f" found aligned room: {room_alias or room.get('name')}")
# try to get members from aligned rooms (usually fails without auth)
for room_alias in ALIGNED_ROOMS[:3]: # limit attempts
for homeserver in HOMESERVERS[:1]: # just try matrix.org
members = get_room_members(homeserver, room_alias)
if members:
print(f" {room_alias}: {len(members)} members")
for member in members:
user_rooms[member['user_id']].append(room_alias)
# filter for multi-room users
multi_room = {u: rooms for u, rooms in user_rooms.items() if len(rooms) >= 2}
print(f" {len(multi_room)} users in 2+ aligned rooms")
# analyze
results = []
for user_id, rooms in multi_room.items():
try:
result = analyze_matrix_user(user_id, rooms)
if result and result['score'] > 0:
results.append(result)
db.save_human(result)
except Exception as e:
print(f" error: {e}")
print(f"scoutd/matrix: found {len(results)} aligned humans (limited by auth)")
return results


@ -1,503 +0,0 @@
"""
scoutd/reddit.py - reddit discovery (DISCOVERY ONLY, NOT OUTREACH)
reddit is a SIGNAL SOURCE, not a contact channel.
flow:
1. scrape reddit for users active in target subs
2. extract their reddit profile
3. look for links TO other platforms (github, mastodon, website, etc.)
4. add to scout database with reddit as signal source
5. reach out via their OTHER platforms, never reddit
if reddit user has no external links:
- add to manual_queue with note "reddit-only, needs manual review"
also detects lost builders - stuck in learnprogramming for years, imposter syndrome, etc.
"""
import requests
import json
import time
import re
from datetime import datetime
from pathlib import Path
from collections import defaultdict
from .signals import analyze_text, ALIGNED_SUBREDDITS, NEGATIVE_SUBREDDITS
from .lost import (
analyze_reddit_for_lost_signals,
analyze_text_for_lost_signals,
classify_user,
get_signal_descriptions,
STUCK_SUBREDDITS,
)
HEADERS = {'User-Agent': 'connectd:v1.0 (community discovery)'}
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'reddit'
# patterns for extracting external platform links
PLATFORM_PATTERNS = {
'github': [
r'github\.com/([a-zA-Z0-9_-]+)',
r'gh:\s*@?([a-zA-Z0-9_-]+)',
],
'mastodon': [
r'@([a-zA-Z0-9_]+)@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})',
r'mastodon\.social/@([a-zA-Z0-9_]+)',
r'fosstodon\.org/@([a-zA-Z0-9_]+)',
r'hachyderm\.io/@([a-zA-Z0-9_]+)',
r'tech\.lgbt/@([a-zA-Z0-9_]+)',
],
'twitter': [
r'twitter\.com/([a-zA-Z0-9_]+)',
r'x\.com/([a-zA-Z0-9_]+)',
r'(?:^|\s)@([a-zA-Z0-9_]{1,15})(?:\s|$)', # bare @handle
],
'bluesky': [
r'bsky\.app/profile/([a-zA-Z0-9_.-]+)',
r'([a-zA-Z0-9_-]+)\.bsky\.social',
],
'website': [
r'https?://([a-zA-Z0-9_-]+\.[a-zA-Z]{2,}[a-zA-Z0-9./_-]*)',
],
'matrix': [
r'@([a-zA-Z0-9_-]+):([a-zA-Z0-9.-]+)',
],
}
def _api_get(url, params=None):
"""rate-limited request"""
cache_key = f"{url}_{json.dumps(params or {}, sort_keys=True)}"
cache_file = CACHE_DIR / f"{hash(cache_key) & 0xffffffff}.json"
CACHE_DIR.mkdir(parents=True, exist_ok=True)
if cache_file.exists():
try:
data = json.loads(cache_file.read_text())
if time.time() - data.get('_cached_at', 0) < 3600:
return data.get('_data')
        except Exception:
            pass
time.sleep(2) # reddit rate limit
try:
resp = requests.get(url, headers=HEADERS, params=params, timeout=30)
resp.raise_for_status()
result = resp.json()
cache_file.write_text(json.dumps({'_cached_at': time.time(), '_data': result}))
return result
except requests.exceptions.RequestException as e:
print(f" reddit api error: {e}")
return None
def extract_external_links(text):
"""extract links to other platforms from text"""
links = {}
if not text:
return links
for platform, patterns in PLATFORM_PATTERNS.items():
for pattern in patterns:
matches = re.findall(pattern, text, re.IGNORECASE)
if matches:
if platform == 'mastodon' and isinstance(matches[0], tuple):
# full fediverse handle
links[platform] = f"@{matches[0][0]}@{matches[0][1]}"
elif platform == 'matrix' and isinstance(matches[0], tuple):
links[platform] = f"@{matches[0][0]}:{matches[0][1]}"
elif platform == 'website':
# skip reddit/imgur/etc
for match in matches:
if not any(x in match.lower() for x in ['reddit', 'imgur', 'redd.it', 'i.redd']):
links[platform] = f"https://{match}"
break
else:
links[platform] = matches[0]
break
return links
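`re.findall` returns tuples rather than strings whenever a pattern contains two capture groups, which is why the mastodon and matrix branches above reassemble `matches[0]` by hand. A minimal standalone sketch of that reassembly (helper name is illustrative):

```python
import re

# same two-group shape as the first mastodon pattern above
FEDI_PATTERN = r'@([a-zA-Z0-9_]+)@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})'

def first_fedi_handle(text: str):
    """return the first @user@instance handle found, or None"""
    matches = re.findall(FEDI_PATTERN, text)
    if matches:
        user, instance = matches[0]  # tuple: one element per capture group
        return f"@{user}@{instance}"
    return None
```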
def get_user_profile(username):
"""get user profile including bio/description"""
url = f'https://www.reddit.com/user/{username}/about.json'
data = _api_get(url)
if not data or 'data' not in data:
return None
profile = data['data']
return {
'username': username,
'name': profile.get('name'),
'bio': profile.get('subreddit', {}).get('public_description', ''),
'title': profile.get('subreddit', {}).get('title', ''),
'icon': profile.get('icon_img'),
'created_utc': profile.get('created_utc'),
'total_karma': profile.get('total_karma', 0),
'link_karma': profile.get('link_karma', 0),
'comment_karma': profile.get('comment_karma', 0),
}
def get_subreddit_users(subreddit, limit=100):
"""get recent posters/commenters from a subreddit"""
users = set()
# posts
url = f'https://www.reddit.com/r/{subreddit}/new.json'
data = _api_get(url, {'limit': limit})
if data and 'data' in data:
for post in data['data'].get('children', []):
author = post['data'].get('author')
if author and author not in ['[deleted]', 'AutoModerator']:
users.add(author)
# comments
url = f'https://www.reddit.com/r/{subreddit}/comments.json'
data = _api_get(url, {'limit': limit})
if data and 'data' in data:
for comment in data['data'].get('children', []):
author = comment['data'].get('author')
if author and author not in ['[deleted]', 'AutoModerator']:
users.add(author)
return users
def get_user_activity(username):
"""get user's posts and comments"""
activity = []
# posts
url = f'https://www.reddit.com/user/{username}/submitted.json'
data = _api_get(url, {'limit': 100})
if data and 'data' in data:
for post in data['data'].get('children', []):
activity.append({
'type': 'post',
'subreddit': post['data'].get('subreddit'),
'title': post['data'].get('title', ''),
'body': post['data'].get('selftext', ''),
'score': post['data'].get('score', 0),
})
# comments
url = f'https://www.reddit.com/user/{username}/comments.json'
data = _api_get(url, {'limit': 100})
if data and 'data' in data:
for comment in data['data'].get('children', []):
activity.append({
'type': 'comment',
'subreddit': comment['data'].get('subreddit'),
'body': comment['data'].get('body', ''),
'score': comment['data'].get('score', 0),
})
return activity
def analyze_reddit_user(username):
"""
analyze a reddit user for alignment and extract external platform links.
reddit is DISCOVERY ONLY - we find users here but contact them elsewhere.
"""
activity = get_user_activity(username)
if not activity:
return None
# get profile for bio
profile = get_user_profile(username)
# count subreddit activity
sub_activity = defaultdict(int)
text_parts = []
total_karma = 0
for item in activity:
sub = item.get('subreddit', '').lower()
if sub:
sub_activity[sub] += 1
if item.get('title'):
text_parts.append(item['title'])
if item.get('body'):
text_parts.append(item['body'])
total_karma += item.get('score', 0)
full_text = ' '.join(text_parts)
text_score, positive_signals, negative_signals = analyze_text(full_text)
# EXTRACT EXTERNAL LINKS - this is the key part
# check profile bio first
external_links = {}
if profile:
bio_text = f"{profile.get('bio', '')} {profile.get('title', '')}"
external_links.update(extract_external_links(bio_text))
# also scan posts/comments for links (people often share their github etc)
activity_links = extract_external_links(full_text)
for platform, link in activity_links.items():
if platform not in external_links:
external_links[platform] = link
# subreddit scoring
sub_score = 0
aligned_subs = []
for sub, count in sub_activity.items():
weight = ALIGNED_SUBREDDITS.get(sub, 0)
if weight > 0:
sub_score += weight * min(count, 5)
aligned_subs.append(sub)
# multi-sub bonus
if len(aligned_subs) >= 5:
sub_score += 30
elif len(aligned_subs) >= 3:
sub_score += 15
# negative sub penalty
for sub in sub_activity:
if sub.lower() in [n.lower() for n in NEGATIVE_SUBREDDITS]:
sub_score -= 50
negative_signals.append(f"r/{sub}")
total_score = text_score + sub_score
# bonus if they have external links (we can actually contact them)
if external_links.get('github'):
total_score += 10
positive_signals.append('has github')
if external_links.get('mastodon'):
total_score += 10
positive_signals.append('has mastodon')
if external_links.get('website'):
total_score += 5
positive_signals.append('has website')
# === LOST BUILDER DETECTION ===
# reddit is HIGH SIGNAL for lost builders - stuck in learnprogramming,
# imposter syndrome posts, "i wish i could" language, etc.
subreddits_list = list(sub_activity.keys())
lost_signals, lost_weight = analyze_reddit_for_lost_signals(activity, subreddits_list)
# also check full text for lost patterns (already done partially in analyze_reddit_for_lost_signals)
text_lost_signals, text_lost_weight = analyze_text_for_lost_signals(full_text)
for sig in text_lost_signals:
if sig not in lost_signals:
lost_signals.append(sig)
lost_weight += text_lost_weight
lost_potential_score = lost_weight
# classify: builder, lost, both, or none
# for reddit, builder_score is based on having external links + high karma
builder_activity = 0
if external_links.get('github'):
builder_activity += 20
if total_karma > 1000:
builder_activity += 15
elif total_karma > 500:
builder_activity += 10
user_type = classify_user(lost_potential_score, builder_activity, total_score)
# confidence
confidence = 0.3
if len(activity) > 20:
confidence += 0.2
if len(aligned_subs) >= 2:
confidence += 0.2
if len(text_parts) > 10:
confidence += 0.2
# higher confidence if we have contact methods
if external_links:
confidence += 0.1
confidence = min(confidence, 0.95)
reasons = []
if aligned_subs:
reasons.append(f"active in: {', '.join(aligned_subs[:5])}")
if positive_signals:
reasons.append(f"signals: {', '.join(positive_signals[:5])}")
if negative_signals:
reasons.append(f"WARNING: {', '.join(negative_signals)}")
if external_links:
reasons.append(f"external: {', '.join(external_links.keys())}")
# add lost reasons if applicable
if user_type == 'lost' or user_type == 'both':
lost_descriptions = get_signal_descriptions(lost_signals)
if lost_descriptions:
reasons.append(f"LOST SIGNALS: {', '.join(lost_descriptions[:3])}")
# determine if this is reddit-only (needs manual review)
reddit_only = len(external_links) == 0
if reddit_only:
reasons.append("REDDIT-ONLY: needs manual review for outreach")
return {
'platform': 'reddit',
'username': username,
'url': f"https://reddit.com/u/{username}",
'score': total_score,
'confidence': confidence,
'signals': positive_signals,
'negative_signals': negative_signals,
'subreddits': aligned_subs,
'activity_count': len(activity),
'karma': total_karma,
'reasons': reasons,
'scraped_at': datetime.now().isoformat(),
# external platform links for outreach
'external_links': external_links,
'reddit_only': reddit_only,
'extra': {
'github': external_links.get('github'),
'mastodon': external_links.get('mastodon'),
'twitter': external_links.get('twitter'),
'bluesky': external_links.get('bluesky'),
'website': external_links.get('website'),
'matrix': external_links.get('matrix'),
'reddit_karma': total_karma,
'reddit_activity': len(activity),
},
# lost builder fields
'lost_potential_score': lost_potential_score,
'lost_signals': lost_signals,
'user_type': user_type,
}
def scrape_reddit(db, limit_per_sub=50):
"""
full reddit scrape - DISCOVERY ONLY
finds aligned users, extracts external links for outreach.
reddit-only users go to manual queue.
"""
print("scoutd/reddit: starting scrape (discovery only, not outreach)...")
# find users in multiple aligned subs
user_subs = defaultdict(set)
# aligned subs - active builders
priority_subs = ['intentionalcommunity', 'cohousing', 'selfhosted',
'homeassistant', 'solarpunk', 'cooperatives', 'privacy',
'localllama', 'homelab', 'degoogle', 'pihole', 'unraid']
# lost builder subs - people who need encouragement
# these folks might be stuck, but they have aligned interests
lost_subs = ['learnprogramming', 'findapath', 'getdisciplined',
'careerguidance', 'cscareerquestions', 'decidingtobebetter']
# scrape both - we want to find lost builders with aligned interests
all_subs = priority_subs + lost_subs
for sub in all_subs:
print(f" scraping r/{sub}...")
users = get_subreddit_users(sub, limit=limit_per_sub)
for user in users:
user_subs[user].add(sub)
print(f" found {len(users)} users")
# filter for multi-sub users
multi_sub = {u: subs for u, subs in user_subs.items() if len(subs) >= 2}
print(f" {len(multi_sub)} users in 2+ aligned subs")
# analyze
results = []
reddit_only_count = 0
external_link_count = 0
builders_found = 0
lost_found = 0
for username in multi_sub:
try:
result = analyze_reddit_user(username)
if result and result['score'] > 0:
results.append(result)
db.save_human(result)
user_type = result.get('user_type', 'none')
# track lost builders - reddit is high signal for these
if user_type == 'lost':
lost_found += 1
lost_score = result.get('lost_potential_score', 0)
if lost_score >= 40:
print(f" 💔 u/{username}: lost_score={lost_score}, values={result['score']} pts")
# lost builders also go to manual queue if reddit-only
if result.get('reddit_only'):
_add_to_manual_queue(result)
elif user_type == 'builder':
builders_found += 1
elif user_type == 'both':
builders_found += 1
lost_found += 1
print(f" ⚡ u/{username}: recovering builder")
# track external links
if result.get('reddit_only'):
reddit_only_count += 1
# add high-value users to manual queue for review
if result['score'] >= 50 and user_type != 'lost': # lost already added above
_add_to_manual_queue(result)
print(f" 📋 u/{username}: {result['score']} pts (reddit-only → manual queue)")
else:
external_link_count += 1
if result['score'] >= 50 and user_type == 'builder':
links = list(result.get('external_links', {}).keys())
print(f" ★ u/{username}: {result['score']} pts → {', '.join(links)}")
except Exception as e:
print(f" error on {username}: {e}")
print(f"scoutd/reddit: found {len(results)} aligned humans")
print(f" - {builders_found} active builders")
print(f" - {lost_found} lost builders (need encouragement)")
print(f" - {external_link_count} with external links (reachable)")
print(f" - {reddit_only_count} reddit-only (manual queue)")
return results
def _add_to_manual_queue(result):
"""add reddit-only user to manual queue for review"""
from pathlib import Path
import json
queue_file = Path(__file__).parent.parent / 'data' / 'manual_queue.json'
queue_file.parent.mkdir(parents=True, exist_ok=True)
queue = []
if queue_file.exists():
try:
queue = json.loads(queue_file.read_text())
        except Exception:
            pass
# check if already in queue
existing = [q for q in queue if q.get('username') == result['username'] and q.get('platform') == 'reddit']
if existing:
return
queue.append({
'platform': 'reddit',
'username': result['username'],
'url': result['url'],
'score': result['score'],
'subreddits': result.get('subreddits', []),
'signals': result.get('signals', []),
'reasons': result.get('reasons', []),
'note': 'reddit-only user - no external links found. DM manually if promising.',
'queued_at': datetime.now().isoformat(),
'status': 'pending',
})
queue_file.write_text(json.dumps(queue, indent=2))


@ -1,158 +0,0 @@
"""
shared signal patterns for all scrapers
"""
import re
# positive signals - what we're looking for
POSITIVE_PATTERNS = [
# values
(r'\b(solarpunk|cyberpunk)\b', 'solarpunk', 10),
(r'\b(anarchis[tm]|mutual.?aid)\b', 'mutual_aid', 10),
(r'\b(cooperative|collective|worker.?owned?|coop|co.?op)\b', 'cooperative', 15),
(r'\b(community|commons)\b', 'community', 5),
(r'\b(intentional.?community|cohousing|commune)\b', 'intentional_community', 20),
# queer-friendly
(r'\b(queer|lgbtq?|trans|nonbinary|enby|genderqueer)\b', 'queer', 15),
(r'\b(they/them|she/her|he/him|xe/xem|any.?pronouns)\b', 'pronouns', 10),
(r'\bblm\b', 'blm', 5),
(r'\b(acab|1312)\b', 'acab', 5),
# tech values
(r'\b(privacy|surveillance|anti.?surveillance)\b', 'privacy', 10),
(r'\b(self.?host(?:ed|ing)?|homelab|home.?server)\b', 'selfhosted', 15),
(r'\b(local.?first|offline.?first)\b', 'local_first', 15),
(r'\b(decentralized?|federation|federated|fediverse)\b', 'decentralized', 10),
(r'\b(foss|libre|open.?source|copyleft)\b', 'foss', 10),
(r'\b(home.?assistant|home.?automation)\b', 'home_automation', 10),
(r'\b(mesh|p2p|peer.?to.?peer)\b', 'p2p', 10),
(r'\b(matrix|xmpp|irc)\b', 'federated_chat', 5),
(r'\b(degoogle|de.?google)\b', 'degoogle', 10),
# location/availability
(r'\b(seattle|portland|pnw|cascadia|pacific.?northwest)\b', 'pnw', 20),
(r'\b(washington|oregon)\b', 'pnw_state', 10),
(r'\b(remote|anywhere|relocate|looking.?to.?move)\b', 'remote', 10),
# anti-capitalism
(r'\b(anti.?capitalis[tm]|post.?capitalis[tm]|degrowth)\b', 'anticapitalist', 10),
# neurodivergent (often overlaps with our values)
(r'\b(neurodivergent|adhd|autistic|autism)\b', 'neurodivergent', 5),
# technical skills (bonus for builders)
(r'\b(rust|go|python|typescript)\b', 'modern_lang', 3),
(r'\b(linux|bsd|nixos)\b', 'unix', 3),
(r'\b(kubernetes|docker|podman)\b', 'containers', 3),
]
# negative signals - red flags
NEGATIVE_PATTERNS = [
(r'\b(qanon|maga|trump|wwg1wga)\b', 'maga', -50),
(r'\b(covid.?hoax|plandemic|5g.?conspiracy)\b', 'conspiracy', -50),
(r'\b(nwo|illuminati|deep.?state)\b', 'conspiracy', -30),
(r'\b(anti.?vax|antivax)\b', 'antivax', -30),
(r'\b(sovereign.?citizen)\b', 'sovcit', -40),
(r'\b(crypto.?bro|web3|nft|blockchain|bitcoin|ethereum)\b', 'crypto', -15),
(r'\b(conservative|republican)\b', 'conservative', -20),
(r'\b(free.?speech.?absolutist)\b', 'freeze_peach', -20),
]
# target topics for repo discovery
TARGET_TOPICS = [
'local-first', 'self-hosted', 'privacy', 'mesh-network',
'cooperative', 'solarpunk', 'decentralized', 'p2p',
'fediverse', 'activitypub', 'matrix-org', 'homeassistant',
'esphome', 'open-source-hardware', 'right-to-repair',
'mutual-aid', 'commons', 'degoogle', 'privacy-tools',
]
# ecosystem repos - high signal contributors
ECOSYSTEM_REPOS = [
'home-assistant/core',
'esphome/esphome',
'matrix-org/synapse',
'LemmyNet/lemmy',
'mastodon/mastodon',
'owncast/owncast',
'nextcloud/server',
'immich-app/immich',
'jellyfin/jellyfin',
'navidrome/navidrome',
'paperless-ngx/paperless-ngx',
'actualbudget/actual',
'firefly-iii/firefly-iii',
'logseq/logseq',
'AppFlowy-IO/AppFlowy',
'siyuan-note/siyuan',
'anytype/anytype-ts',
'calcom/cal.com',
'plausible/analytics',
'umami-software/umami',
]
# aligned subreddits
ALIGNED_SUBREDDITS = {
'intentionalcommunity': 25,
'cohousing': 25,
'cooperatives': 20,
'solarpunk': 20,
'selfhosted': 15,
'homeassistant': 15,
'homelab': 10,
'privacy': 15,
'PrivacyGuides': 15,
'degoogle': 15,
'anticonsumption': 10,
'Frugal': 5,
'simpleliving': 5,
'Seattle': 10,
'Portland': 10,
'cascadia': 15,
'linux': 5,
'opensource': 10,
'FOSS': 10,
}
# negative subreddits
NEGATIVE_SUBREDDITS = [
'conspiracy', 'conservative', 'walkaway', 'louderwithcrowder',
'JordanPeterson', 'TimPool', 'NoNewNormal', 'LockdownSkepticism',
]
# high-signal mastodon instances
ALIGNED_INSTANCES = {
'tech.lgbt': 20,
'social.coop': 25,
'fosstodon.org': 10,
'hackers.town': 15,
'hachyderm.io': 10,
'infosec.exchange': 5,
}
def analyze_text(text):
"""
analyze text for signals
returns: (score, signals_found, negative_signals)
"""
if not text:
return 0, [], []
text = text.lower()
score = 0
signals = []
negatives = []
for pattern, signal_name, points in POSITIVE_PATTERNS:
if re.search(pattern, text, re.IGNORECASE):
score += points
signals.append(signal_name)
for pattern, signal_name, points in NEGATIVE_PATTERNS:
if re.search(pattern, text, re.IGNORECASE):
score += points # points are already negative
negatives.append(signal_name)
return score, list(set(signals)), list(set(negatives))
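The scoring loop in `analyze_text` reduces to: every matching positive pattern adds its weight, every matching negative pattern adds its already-negative weight. A two-pattern sketch using the same weights as the tables above:

```python
import re

# abbreviated table in the same (pattern, name, points) shape as above
MINI_PATTERNS = [
    (r'\bself.?host(?:ed|ing)?\b', 'selfhosted', 15),
    (r'\b(nft|web3)\b', 'crypto', -15),
]

def mini_score(text):
    score, signals = 0, []
    for pattern, name, points in MINI_PATTERNS:
        if re.search(pattern, text, re.IGNORECASE):
            score += points  # negative weights subtract automatically
            signals.append(name)
    return score, signals
```

a profile that matches both a positive and a negative signal can net out to zero, which is why the scrapers filter on `score > 0` afterwards.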


@ -1,255 +0,0 @@
"""
scoutd/twitter.py - twitter/x discovery via nitter instances
scrapes nitter (twitter frontend) to find users posting about aligned topics
without needing twitter API access
nitter instances rotate to avoid rate limits
"""
import requests
import json
import time
import re
from datetime import datetime
from pathlib import Path
from bs4 import BeautifulSoup
from .signals import analyze_text
HEADERS = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0'}
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'twitter'
# nitter instances (rotate through these)
NITTER_INSTANCES = [
'nitter.privacydev.net',
'nitter.poast.org',
'nitter.woodland.cafe',
'nitter.esmailelbob.xyz',
]
# hashtags to search
ALIGNED_HASHTAGS = [
'selfhosted', 'homelab', 'homeassistant', 'foss', 'opensource',
'privacy', 'solarpunk', 'cooperative', 'mutualaid', 'localfirst',
'indieweb', 'smallweb', 'permacomputing', 'degrowth', 'techworkers',
]
_current_instance_idx = 0
def get_nitter_instance():
"""get current nitter instance, rotate on failure"""
global _current_instance_idx
return NITTER_INSTANCES[_current_instance_idx % len(NITTER_INSTANCES)]
def rotate_instance():
"""switch to next nitter instance"""
global _current_instance_idx
_current_instance_idx += 1
def _scrape_page(url, retries=3):
"""scrape a nitter page with instance rotation"""
for attempt in range(retries):
instance = get_nitter_instance()
full_url = url.replace('{instance}', instance)
# check cache
cache_key = f"{full_url}"
cache_file = CACHE_DIR / f"{hash(cache_key) & 0xffffffff}.json"
CACHE_DIR.mkdir(parents=True, exist_ok=True)
if cache_file.exists():
try:
data = json.loads(cache_file.read_text())
if time.time() - data.get('_cached_at', 0) < 3600:
return data.get('_html')
            except Exception:
                pass
time.sleep(2) # rate limit
try:
resp = requests.get(full_url, headers=HEADERS, timeout=30)
if resp.status_code == 200:
cache_file.write_text(json.dumps({
'_cached_at': time.time(),
'_html': resp.text
}))
return resp.text
elif resp.status_code in [429, 503]:
print(f" nitter {instance} rate limited, rotating...")
rotate_instance()
else:
print(f" nitter error: {resp.status_code}")
return None
except Exception as e:
print(f" nitter {instance} error: {e}")
rotate_instance()
return None
def search_hashtag(hashtag):
"""search for tweets with hashtag"""
url = f"https://{{instance}}/search?q=%23{hashtag}&f=tweets"
html = _scrape_page(url)
if not html:
return []
soup = BeautifulSoup(html, 'html.parser')
tweets = []
for tweet_div in soup.select('.timeline-item'):
try:
username_elem = tweet_div.select_one('.username')
content_elem = tweet_div.select_one('.tweet-content')
fullname_elem = tweet_div.select_one('.fullname')
if username_elem and content_elem:
username = username_elem.text.strip().lstrip('@')
tweets.append({
'username': username,
'name': fullname_elem.text.strip() if fullname_elem else username,
'content': content_elem.text.strip(),
})
        except Exception:
continue
return tweets
def get_user_profile(username):
"""get user profile from nitter"""
url = f"https://{{instance}}/{username}"
html = _scrape_page(url)
if not html:
return None
soup = BeautifulSoup(html, 'html.parser')
try:
bio_elem = soup.select_one('.profile-bio')
bio = bio_elem.text.strip() if bio_elem else ''
location_elem = soup.select_one('.profile-location')
location = location_elem.text.strip() if location_elem else ''
website_elem = soup.select_one('.profile-website a')
website = website_elem.get('href') if website_elem else ''
# get recent tweets for more signal
tweets = []
for tweet_div in soup.select('.timeline-item')[:10]:
content_elem = tweet_div.select_one('.tweet-content')
if content_elem:
tweets.append(content_elem.text.strip())
return {
'username': username,
'bio': bio,
'location': location,
'website': website,
'recent_tweets': tweets,
}
except Exception as e:
print(f" error parsing {username}: {e}")
return None
def analyze_twitter_user(username, profile=None):
"""analyze a twitter user for alignment"""
if not profile:
profile = get_user_profile(username)
if not profile:
return None
# collect text
text_parts = [profile.get('bio', '')]
text_parts.extend(profile.get('recent_tweets', []))
full_text = ' '.join(text_parts)
text_score, positive_signals, negative_signals = analyze_text(full_text)
# twitter is noisy, lower base confidence
confidence = 0.25
if len(positive_signals) >= 3:
confidence += 0.2
if profile.get('website'):
confidence += 0.1
if len(profile.get('recent_tweets', [])) >= 5:
confidence += 0.1
confidence = min(confidence, 0.7) # cap lower for twitter
reasons = []
if positive_signals:
reasons.append(f"signals: {', '.join(positive_signals[:5])}")
if negative_signals:
reasons.append(f"WARNING: {', '.join(negative_signals)}")
return {
'platform': 'twitter',
'username': username,
'url': f"https://twitter.com/{username}",
'name': profile.get('name', username),
'bio': profile.get('bio'),
'location': profile.get('location'),
'score': text_score,
'confidence': confidence,
'signals': positive_signals,
'negative_signals': negative_signals,
'reasons': reasons,
'contact': {
'twitter': username,
'website': profile.get('website'),
},
'scraped_at': datetime.now().isoformat(),
}
def scrape_twitter(db, limit_per_hashtag=50):
"""full twitter scrape via nitter"""
print("scoutd/twitter: starting scrape via nitter...")
all_users = {}
for hashtag in ALIGNED_HASHTAGS:
print(f" #{hashtag}...")
tweets = search_hashtag(hashtag)
for tweet in tweets[:limit_per_hashtag]:
username = tweet.get('username')
if username and username not in all_users:
all_users[username] = {
'username': username,
'name': tweet.get('name'),
'hashtags': [hashtag],
}
            elif username and hashtag not in all_users[username]['hashtags']:
                # avoid double-counting a hashtag when a user tweets it twice
                all_users[username]['hashtags'].append(hashtag)
print(f" found {len(tweets)} tweets")
# prioritize users in multiple hashtags
multi_hashtag = {u: d for u, d in all_users.items() if len(d.get('hashtags', [])) >= 2}
print(f" {len(multi_hashtag)} users in 2+ aligned hashtags")
# analyze
results = []
for username, data in list(multi_hashtag.items())[:100]: # limit to prevent rate limits
try:
result = analyze_twitter_user(username)
if result and result['score'] > 0:
results.append(result)
db.save_human(result)
if result['score'] >= 30:
print(f" ★ @{username}: {result['score']} pts")
except Exception as e:
print(f" error on {username}: {e}")
print(f"scoutd/twitter: found {len(results)} aligned humans")
return results


@ -1,143 +0,0 @@
#!/usr/bin/env python3
"""
setup priority user - add yourself to get matches
usage:
python setup_user.py # interactive setup
python setup_user.py --show # show your profile
python setup_user.py --matches # show your matches
"""
import argparse
import json
from db import Database
from db.users import (init_users_table, add_priority_user, get_priority_users,
get_priority_user_matches)
def interactive_setup(db):
"""interactive priority user setup"""
print("=" * 60)
print("connectd priority user setup")
print("=" * 60)
print("\nlink your profiles so connectd can find matches for YOU\n")
name = input("name: ").strip()
email = input("email (for notifications): ").strip()
github = input("github username (optional): ").strip() or None
reddit = input("reddit username (optional): ").strip() or None
mastodon = input("mastodon handle e.g. user@instance (optional): ").strip() or None
lobsters = input("lobste.rs username (optional): ").strip() or None
matrix = input("matrix id e.g. @user:matrix.org (optional): ").strip() or None
location = input("location (e.g. seattle, remote): ").strip() or None
print("\nwhat are you interested in? (comma separated)")
print("examples: self-hosting, cooperatives, solarpunk, home automation")
interests_raw = input("interests: ").strip()
interests = [i.strip() for i in interests_raw.split(',')] if interests_raw else []
print("\nwhat kind of people are you looking to connect with?")
looking_for = input("looking for: ").strip() or None
user_data = {
'name': name,
'email': email,
'github': github,
'reddit': reddit,
'mastodon': mastodon,
'lobsters': lobsters,
'matrix': matrix,
'location': location,
'interests': interests,
'looking_for': looking_for,
}
user_id = add_priority_user(db.conn, user_data)
print(f"\n✓ added as priority user #{user_id}")
print("connectd will now find matches for you")
def show_profile(db):
"""show current priority user profile"""
users = get_priority_users(db.conn)
if not users:
print("no priority users configured")
print("run: python setup_user.py")
return
for user in users:
print("=" * 60)
print(f"priority user #{user['id']}: {user['name']}")
print("=" * 60)
print(f"email: {user['email']}")
if user['github']:
print(f"github: {user['github']}")
if user['reddit']:
print(f"reddit: {user['reddit']}")
if user['mastodon']:
print(f"mastodon: {user['mastodon']}")
if user['lobsters']:
print(f"lobsters: {user['lobsters']}")
if user['matrix']:
print(f"matrix: {user['matrix']}")
if user['location']:
print(f"location: {user['location']}")
if user['interests']:
interests = json.loads(user['interests']) if isinstance(user['interests'], str) else user['interests']
print(f"interests: {', '.join(interests)}")
if user['looking_for']:
print(f"looking for: {user['looking_for']}")
def show_matches(db):
"""show matches for priority user"""
users = get_priority_users(db.conn)
if not users:
print("no priority users configured")
return
for user in users:
print(f"\n=== matches for {user['name']} ===\n")
matches = get_priority_user_matches(db.conn, user['id'], limit=20)
if not matches:
print("no matches yet - run the daemon to discover people")
continue
for i, match in enumerate(matches, 1):
print(f"{i}. {match['username']} ({match['platform']})")
print(f" score: {match['overlap_score']:.0f}")
print(f" url: {match['url']}")
reasons = match.get('overlap_reasons', '[]')
if isinstance(reasons, str):
reasons = json.loads(reasons)
if reasons:
print(f" why: {reasons[0] if reasons else ''}")
print()
def main():
parser = argparse.ArgumentParser(description='setup priority user')
parser.add_argument('--show', action='store_true', help='show your profile')
parser.add_argument('--matches', action='store_true', help='show your matches')
args = parser.parse_args()
db = Database()
init_users_table(db.conn)
if args.show:
show_profile(db)
elif args.matches:
show_matches(db)
else:
interactive_setup(db)
db.close()
if __name__ == '__main__':
main()


@ -334,18 +334,24 @@ def determine_best_contact(human):
"""
determine best contact method based on WHERE THEY'RE MOST ACTIVE
uses activity-based selection from groq_draft module
returns: (method, info, fallbacks)
uses activity-based selection - ranks by user's actual usage
"""
from introd.groq_draft import determine_contact_method as activity_based_contact
method, info = activity_based_contact(human)
method, info, fallbacks = activity_based_contact(human)
# convert github_issue info to dict format for delivery
if method == 'github_issue' and isinstance(info, str) and '/' in info:
parts = info.split('/', 1)
return method, {'owner': parts[0], 'repo': parts[1]}
def format_info(m, i):
if m == 'github_issue' and isinstance(i, str) and '/' in i:
parts = i.split('/', 1)
return {'owner': parts[0], 'repo': parts[1]}
return i
return method, info
info = format_info(method, info)
fallbacks = [(m, format_info(m, i)) for m, i in fallbacks]
return method, info, fallbacks
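The `format_info` normalization in the hunk above is small enough to sketch standalone. The `"owner/repo"` string form for `github_issue` contacts is taken from the diff; the sample values are invented:

```python
# normalize github_issue contact info from an "owner/repo" string to a dict,
# leaving every other method's info untouched (mirrors the hunk above)
def format_info(method, info):
    if method == 'github_issue' and isinstance(info, str) and '/' in info:
        owner, repo = info.split('/', 1)
        return {'owner': owner, 'repo': repo}
    return info

info = format_info('github_issue', 'alice/dotfiles')
fallbacks = [(m, format_info(m, i)) for m, i in [('email', 'alice@example.com')]]
print(info)       # {'owner': 'alice', 'repo': 'dotfiles'}
print(fallbacks)  # [('email', 'alice@example.com')]
```

Applying the same function to the primary and every fallback keeps delivery code from ever seeing the raw string form.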
def deliver_intro(match_data, intro_draft, dry_run=False):
@ -362,8 +368,8 @@ def deliver_intro(match_data, intro_draft, dry_run=False):
if already_contacted(recipient_id):
return False, "already contacted", None
# determine contact method
method, contact_info = determine_best_contact(recipient)
# determine contact method with fallbacks
method, contact_info, fallbacks = determine_best_contact(recipient)
log = load_delivery_log()
result = {
@ -423,9 +429,60 @@ def deliver_intro(match_data, intro_draft, dry_run=False):
success = True
error = "added to manual queue"
# if failed and we have fallbacks, try them
if not success and fallbacks:
for fallback_method, fallback_info in fallbacks:
result['fallback_attempts'] = result.get('fallback_attempts', [])
result['fallback_attempts'].append({
'method': fallback_method,
'contact_info': fallback_info
})
fb_success = False
fb_error = None
if fallback_method == 'email':
subject = "someone you might want to know - connectd"
fb_success, fb_error = send_email(fallback_info, subject, intro_draft, dry_run)
elif fallback_method == 'mastodon':
fb_success, fb_error = send_mastodon_dm(fallback_info, intro_draft, dry_run)
elif fallback_method == 'bluesky':
fb_success, fb_error = send_bluesky_dm(fallback_info, intro_draft, dry_run)
elif fallback_method == 'matrix':
fb_success, fb_error = send_matrix_dm(fallback_info, intro_draft, dry_run)
elif fallback_method == 'lemmy':
from scoutd.lemmy import send_lemmy_dm
fb_success, fb_error = send_lemmy_dm(fallback_info, intro_draft, dry_run)
elif fallback_method == 'discord':
from scoutd.discord import send_discord_dm
fb_success, fb_error = send_discord_dm(fallback_info, intro_draft, dry_run)
elif fallback_method == 'github_issue':
owner = fallback_info.get('owner')
repo = fallback_info.get('repo')
title = "community introduction from connectd"
github_body = f"""hey {recipient.get('name') or recipient.get('username')},
{intro_draft}
---
*automated introduction from connectd*
"""
fb_success, fb_error = create_github_issue(owner, repo, title, github_body, dry_run)
if fb_success:
success = True
method = fallback_method
contact_info = fallback_info
error = None
result['fallback_succeeded'] = fallback_method
break
else:
result['fallback_attempts'][-1]['error'] = fb_error
# log result
result['success'] = success
result['error'] = error
result['final_method'] = method
if success:
log['sent'].append(result)
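The fallback-delivery loop above reduces to a simple pattern: try the primary (method, info) pair, then each fallback in order, recording failed attempts. A standalone sketch, with stub senders standing in for `send_email` / `send_mastodon_dm` and the rest:

```python
# try each sender until one succeeds; return the winning method plus a log
# of the attempts that failed along the way
def deliver_with_fallbacks(senders, primary, fallbacks):
    attempts = []
    for method, info in [primary] + fallbacks:
        ok, err = senders[method](info)
        if ok:
            return method, attempts
        attempts.append({'method': method, 'error': err})
    return None, attempts

senders = {
    'mastodon': lambda info: (False, 'rate limited'),
    'email': lambda info: (True, None),
}
final, failed = deliver_with_fallbacks(
    senders, ('mastodon', '@alice@example.social'), [('email', 'alice@example.com')]
)
print(final)   # email
print(failed)  # [{'method': 'mastodon', 'error': 'rate limited'}]
```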


@ -1,10 +1,14 @@
"""
introd/draft.py - AI writes intro messages referencing both parties' work
now with interest system links
"""
import json
# intro template - transparent about being AI, neutral third party
# base URL for connectd profiles
CONNECTD_URL = "https://connectd.sudoxreboot.com"
# intro template - now with interest links
INTRO_TEMPLATE = """hi {recipient_name},
i'm an AI that connects isolated builders working on similar things.
@ -17,7 +21,8 @@ overlap: {overlap_summary}
thought you might benefit from knowing each other.
their work: {other_url}
their profile: {profile_url}
{interested_line}
no pitch. just connection. ignore if not useful.
@ -32,7 +37,7 @@ you: {recipient_summary}
overlap: {overlap_summary}
their work: {other_url}
their profile: {profile_url}
no pitch, just connection.
"""
@ -51,12 +56,18 @@ def summarize_human(human_data):
# signals/interests
signals = human_data.get('signals', [])
if isinstance(signals, str):
try:
signals = json.loads(signals)
except (json.JSONDecodeError, TypeError):
signals = []
# extra data
extra = human_data.get('extra', {})
if isinstance(extra, str):
try:
extra = json.loads(extra)
except (json.JSONDecodeError, TypeError):
extra = {}
# build summary based on available data
topics = extra.get('topics', [])
@ -103,7 +114,10 @@ def summarize_overlap(overlap_data):
"""generate overlap summary"""
reasons = overlap_data.get('overlap_reasons', [])
if isinstance(reasons, str):
try:
reasons = json.loads(reasons)
except (json.JSONDecodeError, TypeError):
reasons = []
if reasons:
return ' | '.join(reasons[:3])
@ -116,12 +130,14 @@ def summarize_overlap(overlap_data):
return "aligned values and interests"
def draft_intro(match_data, recipient='a'):
def draft_intro(match_data, recipient='a', recipient_token=None, interested_count=0):
"""
draft an intro message for a match
match_data: dict with human_a, human_b, overlap info
recipient: 'a' or 'b' - who receives this intro
recipient_token: token for the recipient (to track who clicked)
interested_count: how many people are already interested in the recipient
returns: dict with draft text, channel, metadata
"""
@ -135,19 +151,37 @@ def draft_intro(match_data, recipient='a'):
# get names
recipient_name = recipient_human.get('name') or recipient_human.get('username', 'friend')
other_name = other_human.get('name') or other_human.get('username', 'someone')
other_username = other_human.get('username', '')
# generate summaries
recipient_summary = summarize_human(recipient_human)
other_summary = summarize_human(other_human)
overlap_summary = summarize_overlap(match_data)
# other's url
other_url = other_human.get('url', '')
# build profile URL with token if available
if other_username:
profile_url = f"{CONNECTD_URL}/{other_username}"
if recipient_token:
profile_url += f"?t={recipient_token}"
else:
profile_url = other_human.get('url', '')
# interested line - tells them about their inbox
interested_line = ''
if recipient_token:
interested_url = f"{CONNECTD_URL}/interested/{recipient_token}"
if interested_count > 0:
interested_line = f"\n{interested_count} people already want to meet you: {interested_url}"
else:
interested_line = f"\nbe the first to connect: {interested_url}"
# determine best channel
contact = recipient_human.get('contact', {})
if isinstance(contact, str):
try:
contact = json.loads(contact)
except (json.JSONDecodeError, TypeError):
contact = {}
channel = None
channel_address = None
@ -156,15 +190,12 @@ def draft_intro(match_data, recipient='a'):
if contact.get('email'):
channel = 'email'
channel_address = contact['email']
# github issue/discussion
elif recipient_human.get('platform') == 'github':
channel = 'github'
channel_address = recipient_human.get('url')
# mastodon DM
elif recipient_human.get('platform') == 'mastodon':
channel = 'mastodon'
channel_address = recipient_human.get('username')
# reddit message
elif recipient_human.get('platform') == 'reddit':
channel = 'reddit'
channel_address = recipient_human.get('username')
@ -180,12 +211,13 @@ def draft_intro(match_data, recipient='a'):
# render draft
draft = template.format(
recipient_name=recipient_name.split()[0] if recipient_name else 'friend', # first name only
recipient_name=recipient_name.split()[0] if recipient_name else 'friend',
recipient_summary=recipient_summary,
other_name=other_name.split()[0] if other_name else 'someone',
other_summary=other_summary,
overlap_summary=overlap_summary,
other_url=other_url,
profile_url=profile_url,
interested_line=interested_line,
)
return {
@ -196,15 +228,16 @@ def draft_intro(match_data, recipient='a'):
'draft': draft,
'overlap_score': match_data.get('overlap_score', 0),
'match_id': match_data.get('id'),
'recipient_token': recipient_token,
}
def draft_intros_for_match(match_data):
def draft_intros_for_match(match_data, token_a=None, token_b=None, interested_a=0, interested_b=0):
"""
draft intros for both parties in a match
returns list of two intro dicts
"""
intro_a = draft_intro(match_data, recipient='a')
intro_b = draft_intro(match_data, recipient='b')
intro_a = draft_intro(match_data, recipient='a', recipient_token=token_a, interested_count=interested_a)
intro_b = draft_intro(match_data, recipient='b', recipient_token=token_b, interested_count=interested_b)
return [intro_a, intro_b]
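The token-bearing links that `draft_intro` builds can be sketched in isolation. `CONNECTD_URL` and the `?t=` / `/interested/` routes come from the diff above; the username and token here are invented:

```python
CONNECTD_URL = "https://connectd.sudoxreboot.com"

def build_links(other_username, recipient_token=None, interested_count=0):
    # profile link carries the recipient's token so clicks can be attributed
    profile_url = f"{CONNECTD_URL}/{other_username}"
    interested_line = ''
    if recipient_token:
        profile_url += f"?t={recipient_token}"
        interested_url = f"{CONNECTD_URL}/interested/{recipient_token}"
        if interested_count > 0:
            interested_line = f"\n{interested_count} people already want to meet you: {interested_url}"
        else:
            interested_line = f"\nbe the first to connect: {interested_url}"
    return profile_url, interested_line

url, line = build_links('alice', recipient_token='tok123')
print(url)  # https://connectd.sudoxreboot.com/alice?t=tok123
```

Without a token the profile URL is plain and the interested line stays empty, matching the fallback branch in the diff.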


@ -1,437 +1,435 @@
"""
introd/groq_draft.py - groq llama 4 maverick for smart intro drafting
uses groq api to generate personalized, natural intro messages
that don't sound like ai-generated slop
connectd - groq message drafting
reads soul from file, uses as guideline for llm to personalize
"""
import os
import json
import requests
from datetime import datetime
from groq import Groq
GROQ_API_KEY = os.environ.get('GROQ_API_KEY', '')
GROQ_API_URL = 'https://api.groq.com/openai/v1/chat/completions'
MODEL = os.environ.get('GROQ_MODEL', 'llama-3.1-70b-versatile')
GROQ_API_KEY = os.getenv("GROQ_API_KEY")
GROQ_MODEL = os.getenv("GROQ_MODEL", "llama-3.3-70b-versatile")
client = Groq(api_key=GROQ_API_KEY) if GROQ_API_KEY else None
def determine_contact_method(human):
"""
determine best contact method based on WHERE THEY'RE MOST ACTIVE
don't use fixed hierarchy - analyze activity per platform:
- count posts/commits/activity
- weight by recency (last 30 days matters more)
- contact them where they already are
- fall back to email only if no social activity
"""
from datetime import datetime, timedelta
extra = human.get('extra', {})
if isinstance(extra, str):
extra = json.loads(extra) if extra else {}
# handle nested extra.extra from old save format
if 'extra' in extra and isinstance(extra['extra'], dict):
extra = {**extra, **extra['extra']}
contact = human.get('contact', {})
if isinstance(contact, str):
contact = json.loads(contact) if contact else {}
# collect activity scores per platform
activity_scores = {}
now = datetime.now()
thirty_days_ago = now - timedelta(days=30)
ninety_days_ago = now - timedelta(days=90)
# github activity
github_username = human.get('username') if human.get('platform') == 'github' else extra.get('github')
if github_username:
github_score = 0
top_repos = extra.get('top_repos', [])
for repo in top_repos:
# recent commits weight more
pushed_at = repo.get('pushed_at', '')
if pushed_at:
try:
push_date = datetime.fromisoformat(pushed_at.replace('Z', '+00:00')).replace(tzinfo=None)
if push_date > thirty_days_ago:
github_score += 10 # very recent
elif push_date > ninety_days_ago:
github_score += 5 # somewhat recent
else:
github_score += 1 # old but exists
except:
github_score += 1
# load soul from file (guideline, not script)
SOUL_PATH = os.getenv("SOUL_PATH", "/app/soul.txt")
def load_soul():
try:
with open(SOUL_PATH, 'r') as f:
return f.read().strip()
except:
return None
# stars indicate engagement
github_score += min(repo.get('stars', 0) // 10, 5)
# commit activity from deep scrape
commit_count = extra.get('commit_count', 0)
github_score += min(commit_count // 10, 20)
if github_score > 0:
activity_scores['github_issue'] = {
'score': github_score,
'info': f"{github_username}/{top_repos[0]['name']}" if top_repos else github_username
}
# mastodon activity
mastodon_handle = extra.get('mastodon') or contact.get('mastodon')
if mastodon_handle:
mastodon_score = 0
statuses_count = extra.get('mastodon_statuses', 0) or human.get('statuses_count', 0)
# high post count = active user
if statuses_count > 1000:
mastodon_score += 30
elif statuses_count > 500:
mastodon_score += 20
elif statuses_count > 100:
mastodon_score += 10
elif statuses_count > 0:
mastodon_score += 5
# platform bonus for fediverse (values-aligned)
mastodon_score += 10
# bonus if handle was discovered via rel="me" or similar verification
# (having a handle linked from their website = they want to be contacted there)
handles = extra.get('handles', {})
if handles.get('mastodon') == mastodon_handle:
mastodon_score += 15 # verified handle bonus
if mastodon_score > 0:
activity_scores['mastodon'] = {'score': mastodon_score, 'info': mastodon_handle}
# bluesky activity
bluesky_handle = extra.get('bluesky') or contact.get('bluesky')
if bluesky_handle:
bluesky_score = 0
posts_count = extra.get('bluesky_posts', 0) or human.get('posts_count', 0)
if posts_count > 500:
bluesky_score += 25
elif posts_count > 100:
bluesky_score += 15
elif posts_count > 0:
bluesky_score += 5
# newer platform, slightly lower weight
bluesky_score += 5
if bluesky_score > 0:
activity_scores['bluesky'] = {'score': bluesky_score, 'info': bluesky_handle}
# twitter activity
twitter_handle = extra.get('twitter') or contact.get('twitter')
if twitter_handle:
twitter_score = 0
tweets_count = extra.get('twitter_tweets', 0)
if tweets_count > 1000:
twitter_score += 20
elif tweets_count > 100:
twitter_score += 10
elif tweets_count > 0:
twitter_score += 5
# if we found them via twitter hashtags, they're active there
if human.get('platform') == 'twitter':
twitter_score += 15
if twitter_score > 0:
activity_scores['twitter'] = {'score': twitter_score, 'info': twitter_handle}
# NOTE: reddit is DISCOVERY ONLY, not a contact method
# we find users on reddit but reach out via their external links (github, mastodon, etc.)
# reddit-only users go to manual_queue for review
# lobsters activity
lobsters_username = extra.get('lobsters') or contact.get('lobsters')
if lobsters_username or human.get('platform') == 'lobsters':
lobsters_score = 0
lobsters_username = lobsters_username or human.get('username')
karma = extra.get('lobsters_karma', 0) or human.get('karma', 0)
# lobsters is invite-only, high signal
lobsters_score += 15
if karma > 100:
lobsters_score += 15
elif karma > 50:
lobsters_score += 10
elif karma > 0:
lobsters_score += 5
if lobsters_score > 0:
activity_scores['lobsters'] = {'score': lobsters_score, 'info': lobsters_username}
# matrix activity
matrix_id = extra.get('matrix') or contact.get('matrix')
if matrix_id:
matrix_score = 0
# matrix users are typically privacy-conscious and technical
matrix_score += 15 # platform bonus for decentralized chat
# bonus if handle was discovered via rel="me" verification
handles = extra.get('handles', {})
if handles.get('matrix') == matrix_id:
matrix_score += 10 # verified handle bonus
if matrix_score > 0:
activity_scores['matrix'] = {'score': matrix_score, 'info': matrix_id}
# lemmy activity (fediverse)
lemmy_username = human.get('username') if human.get('platform') == 'lemmy' else extra.get('lemmy')
if lemmy_username:
lemmy_score = 0
# lemmy is fediverse - high values alignment
lemmy_score += 20 # fediverse platform bonus
post_count = extra.get('post_count', 0)
comment_count = extra.get('comment_count', 0)
if post_count > 100:
lemmy_score += 15
elif post_count > 50:
lemmy_score += 10
elif post_count > 10:
lemmy_score += 5
if comment_count > 500:
lemmy_score += 10
elif comment_count > 100:
lemmy_score += 5
if lemmy_score > 0:
activity_scores['lemmy'] = {'score': lemmy_score, 'info': lemmy_username}
# pick highest activity platform
if activity_scores:
best_platform = max(activity_scores.items(), key=lambda x: x[1]['score'])
return best_platform[0], best_platform[1]['info']
# fall back to email ONLY if no social activity detected
email = extra.get('email') or contact.get('email')
# also check emails list
if not email:
emails = extra.get('emails') or contact.get('emails') or []
for e in emails:
if e and '@' in e and 'noreply' not in e.lower():
email = e
break
if email and '@' in email and 'noreply' not in email.lower():
return 'email', email
# last resort: manual
return 'manual', None
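The selection logic that ends here condenses to: pick the platform with the highest activity score, else fall back to a usable email, else route to manual review. A sketch, with the score-dict shape mirroring `activity_scores` in the diff and the sample handles invented:

```python
def pick_contact(activity_scores, email=None):
    # highest-scoring platform wins; email only if no social activity exists
    if activity_scores:
        method, best = max(activity_scores.items(), key=lambda x: x[1]['score'])
        return method, best['info']
    if email and '@' in email and 'noreply' not in email.lower():
        return 'email', email
    return 'manual', None

print(pick_contact({
    'mastodon': {'score': 40, 'info': '@bea@fosstodon.org'},
    'github_issue': {'score': 25, 'info': 'bea/homelab'},
}))  # ('mastodon', '@bea@fosstodon.org')
print(pick_contact({}, 'bea@noreply.github.com'))  # ('manual', None)
```

Keeping the tie-breaking in one `max()` call makes the ranking auditable: to see why someone was contacted on a platform, dump `activity_scores` before the selection.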
def draft_intro_with_llm(match_data, recipient='a', dry_run=False):
SIGNATURE_HTML = """
<div style="margin-top: 24px; padding-top: 16px; border-top: 1px solid #333;">
<div style="margin-bottom: 12px;">
<a href="https://github.com/sudoxnym/connectd" style="color: #8b5cf6; text-decoration: none; font-size: 14px;">github.com/sudoxnym/connectd</a>
<span style="color: #666; font-size: 12px; margin-left: 8px;">(main repo)</span>
</div>
<div style="display: flex; gap: 16px; align-items: center;">
<a href="https://github.com/connectd-daemon" title="GitHub" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M12 .297c-6.63 0-12 5.373-12 12 0 5.303 3.438 9.8 8.205 11.385.6.113.82-.258.82-.577 0-.285-.01-1.04-.015-2.04-3.338.724-4.042-1.61-4.042-1.61C4.422 18.07 3.633 17.7 3.633 17.7c-1.087-.744.084-.729.084-.729 1.205.084 1.838 1.236 1.838 1.236 1.07 1.835 2.809 1.305 3.495.998.108-.776.417-1.305.76-1.605-2.665-.3-5.466-1.332-5.466-5.93 0-1.31.465-2.38 1.235-3.22-.135-.303-.54-1.523.105-3.176 0 0 1.005-.322 3.3 1.23.96-.267 1.98-.399 3-.405 1.02.006 2.04.138 3 .405 2.28-1.552 3.285-1.23 3.285-1.23.645 1.653.24 2.873.12 3.176.765.84 1.23 1.91 1.23 3.22 0 4.61-2.805 5.625-5.475 5.92.42.36.81 1.096.81 2.22 0 1.606-.015 2.896-.015 3.286 0 .315.21.69.825.57C20.565 22.092 24 17.592 24 12.297c0-6.627-5.373-12-12-12"/></svg>
</a>
<a href="https://mastodon.sudoxreboot.com/@connectd" title="Mastodon" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M23.268 5.313c-.35-2.578-2.617-4.61-5.304-5.004C17.51.242 15.792 0 11.813 0h-.03c-3.98 0-4.835.242-5.288.309C3.882.692 1.496 2.518.917 5.127.64 6.412.61 7.837.661 9.143c.074 1.874.088 3.745.26 5.611.118 1.24.325 2.47.62 3.68.55 2.237 2.777 4.098 4.96 4.857 2.336.792 4.849.923 7.256.38.265-.061.527-.132.786-.213.585-.184 1.27-.39 1.774-.753a.057.057 0 0 0 .023-.043v-1.809a.052.052 0 0 0-.02-.041.053.053 0 0 0-.046-.01 20.282 20.282 0 0 1-4.709.545c-2.73 0-3.463-1.284-3.674-1.818a5.593 5.593 0 0 1-.319-1.433.053.053 0 0 1 .066-.054c1.517.363 3.072.546 4.632.546.376 0 .75 0 1.125-.01 1.57-.044 3.224-.124 4.768-.422.038-.008.077-.015.11-.024 2.435-.464 4.753-1.92 4.989-5.604.008-.145.03-1.52.03-1.67.002-.512.167-3.63-.024-5.545zm-3.748 9.195h-2.561V8.29c0-1.309-.55-1.976-1.67-1.976-1.23 0-1.846.79-1.846 2.35v3.403h-2.546V8.663c0-1.56-.617-2.35-1.848-2.35-1.112 0-1.668.668-1.67 1.977v6.218H4.822V8.102c0-1.31.337-2.35 1.011-3.12.696-.77 1.608-1.164 2.74-1.164 1.311 0 2.302.5 2.962 1.498l.638 1.06.638-1.06c.66-.999 1.65-1.498 2.96-1.498 1.13 0 2.043.395 2.74 1.164.675.77 1.012 1.81 1.012 3.12z"/></svg>
</a>
<a href="https://bsky.app/profile/connectd.bsky.social" title="Bluesky" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M5.202 2.857C7.954 4.922 10.913 9.11 12 11.358c1.087-2.247 4.046-6.436 6.798-8.501C20.783 1.366 24 .213 24 3.883c0 .732-.42 6.156-.667 7.037-.856 3.061-3.978 3.842-6.755 3.37 4.854.826 6.089 3.562 3.422 6.299-5.065 5.196-7.28-1.304-7.847-2.97-.104-.305-.152-.448-.153-.327 0-.121-.05.022-.153.327-.568 1.666-2.782 8.166-7.847 2.97-2.667-2.737-1.432-5.473 3.422-6.3-2.777.473-5.899-.308-6.755-3.369C.42 10.04 0 4.615 0 3.883c0-3.67 3.217-2.517 5.202-1.026"/></svg>
</a>
<a href="https://lemmy.sudoxreboot.com/c/connectd" title="Lemmy" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M2.9595 4.2228a3.9132 3.9132 0 0 0-.332.019c-.8781.1012-1.67.5699-2.155 1.3862-.475.8-.5922 1.6809-.35 2.4971.2421.8162.8297 1.5575 1.6982 2.1449.0053.0035.0106.0076.0163.0114.746.4498 1.492.7431 2.2877.8994-.02.3318-.0272.6689-.006 1.0181.0634 1.0432.4368 2.0006.996 2.8492l-2.0061.8189a.4163.4163 0 0 0-.2276.2239.416.416 0 0 0 .0879.455.415.415 0 0 0 .2941.1231.4156.4156 0 0 0 .1595-.0312l2.2093-.9035c.408.4859.8695.9315 1.3723 1.318.0196.0151.0407.0264.0603.0423l-1.2918 1.7103a.416.416 0 0 0 .664.501l1.314-1.7385c.7185.4548 1.4782.7927 2.2294 1.0242.3833.7209 1.1379 1.1871 2.0202 1.1871.8907 0 1.6442-.501 2.0242-1.2072.744-.2347 1.4959-.5729 2.2073-1.0262l1.332 1.7606a.4157.4157 0 0 0 .7439-.1936.4165.4165 0 0 0-.0799-.3074l-1.3099-1.7345c.0083-.0075.0178-.0113.0261-.0188.4968-.3803.9549-.8175 1.3622-1.2939l2.155.8794a.4156.4156 0 0 0 .5412-.2276.4151.4151 0 0 0-.2273-.5432l-1.9438-.7928c.577-.8538.9697-1.8183 1.0504-2.8693.0268-.3507.0242-.6914.0079-1.0262.7905-.1572 1.5321-.4502 2.2737-.8974.0053-.0033.011-.0076.0163-.0113.8684-.5874 1.456-1.3287 1.6982-2.145.2421-.8161.125-1.697-.3501-2.497-.4849-.8163-1.2768-1.2852-2.155-1.3863a3.2175 3.2175 0 0 0-.332-.0189c-.7852-.0151-1.6231.229-2.4286.6942-.5926.342-1.1252.867-1.5433 1.4387-1.1699-.6703-2.6923-1.0476-4.5635-1.0785a15.5768 15.5768 0 0 0-.5111 0c-2.085.034-3.7537.43-5.0142 1.1449-.0033-.0038-.0045-.0114-.008-.0152-.4233-.5916-.973-1.1365-1.5835-1.489-.8055-.465-1.6434-.7083-2.4286-.6941Zm.2858.7365c.5568.042 1.1696.2358 1.7787.5875.485.28.9757.7554 1.346 1.2696a5.6875 5.6875 0 0 0-.4969.4085c-.9201.8516-1.4615 1.9597-1.668 3.2335-.6809-.1402-1.3183-.3945-1.984-.7948-.7553-.5128-1.2159-1.1225-1.4004-1.7445-.1851-.624-.1074-1.2712.2776-1.9196.3743-.63.9275-.9534 1.6118-1.0322a2.796 2.796 0 0 1 .5352-.0076Zm17.5094 0a2.797 2.797 0 0 1 .5353.0075c.6842.0786 1.2374.4021 1.6117 1.0322.385.6484.4627 1.2957.2776 1.9196-.1845.622-.645 
1.2317-1.4004 1.7445-.6578.3955-1.2881.6472-1.9598.7888-.1942-1.2968-.7375-2.4338-1.666-3.302a5.5639 5.5639 0 0 0-.4709-.3923c.3645-.49.8287-.9428 1.2938-1.2113.6091-.3515 1.2219-.5454 1.7787-.5875ZM12.006 6.0036a14.832 14.832 0 0 1 .487 0c2.3901.0393 4.0848.67 5.1631 1.678 1.1501 1.0754 1.6423 2.6006 1.499 4.467-.1311 1.7079-1.2203 3.2281-2.652 4.324-.694.5313-1.4626.9354-2.2254 1.2294.0031-.0453.014-.0888.014-.1349.0029-1.1964-.9313-2.2133-2.2918-2.2133-1.3606 0-2.3222 1.0154-2.2918 2.2213.0013.0507.014.0972.0181.1471-.781-.2933-1.5696-.7013-2.2777-1.2456-1.4239-1.0945-2.4997-2.6129-2.6037-4.322-.1129-1.8567.3778-3.3382 1.5212-4.3965C7.5094 6.7 9.352 6.047 12.006 6.0036Zm-3.6419 6.8291c-.6053 0-1.0966.4903-1.0966 1.0966 0 .6063.4913 1.0986 1.0966 1.0986s1.0966-.4923 1.0966-1.0986c0-.6063-.4913-1.0966-1.0966-1.0966zm7.2819.0113c-.5998 0-1.0866.4859-1.0866 1.0866s.4868 1.0885 1.0866 1.0885c.5997 0 1.0865-.4878 1.0865-1.0885s-.4868-1.0866-1.0865-1.0866zM12 16.0835c1.0237 0 1.5654.638 1.5634 1.4829-.0018.7849-.6723 1.485-1.5634 1.485-.9167 0-1.54-.5629-1.5634-1.493-.0212-.8347.5397-1.4749 1.5634-1.4749Z"/></svg>
</a>
<a href="https://discord.gg/connectd" title="Discord" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M20.317 4.3698a19.7913 19.7913 0 00-4.8851-1.5152.0741.0741 0 00-.0785.0371c-.211.3753-.4447.8648-.6083 1.2495-1.8447-.2762-3.68-.2762-5.4868 0-.1636-.3933-.4058-.8742-.6177-1.2495a.077.077 0 00-.0785-.037 19.7363 19.7363 0 00-4.8852 1.515.0699.0699 0 00-.0321.0277C.5334 9.0458-.319 13.5799.0992 18.0578a.0824.0824 0 00.0312.0561c2.0528 1.5076 4.0413 2.4228 5.9929 3.0294a.0777.0777 0 00.0842-.0276c.4616-.6304.8731-1.2952 1.226-1.9942a.076.076 0 00-.0416-.1057c-.6528-.2476-1.2743-.5495-1.8722-.8923a.077.077 0 01-.0076-.1277c.1258-.0943.2517-.1923.3718-.2914a.0743.0743 0 01.0776-.0105c3.9278 1.7933 8.18 1.7933 12.0614 0a.0739.0739 0 01.0785.0095c.1202.099.246.1981.3728.2924a.077.077 0 01-.0066.1276 12.2986 12.2986 0 01-1.873.8914.0766.0766 0 00-.0407.1067c.3604.698.7719 1.3628 1.225 1.9932a.076.076 0 00.0842.0286c1.961-.6067 3.9495-1.5219 6.0023-3.0294a.077.077 0 00.0313-.0552c.5004-5.177-.8382-9.6739-3.5485-13.6604a.061.061 0 00-.0312-.0286zM8.02 15.3312c-1.1825 0-2.1569-1.0857-2.1569-2.419 0-1.3332.9555-2.4189 2.157-2.4189 1.2108 0 2.1757 1.0952 2.1568 2.419 0 1.3332-.9555 2.4189-2.1569 2.4189zm7.9748 0c-1.1825 0-2.1569-1.0857-2.1569-2.419 0-1.3332.9554-2.4189 2.1569-2.4189 1.2108 0 2.1757 1.0952 2.1568 2.419 0 1.3332-.946 2.4189-2.1568 2.4189Z"/></svg>
</a>
<a href="https://matrix.to/#/@connectd:sudoxreboot.com" title="Matrix" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M.632.55v22.9H2.28V24H0V0h2.28v.55zm7.043 7.26v1.157h.033c.309-.443.683-.784 1.117-1.024.433-.245.936-.365 1.5-.365.54 0 1.033.107 1.481.314.448.208.785.582 1.02 1.108.254-.374.6-.706 1.034-.992.434-.287.95-.43 1.546-.43.453 0 .872.056 1.26.167.388.11.716.286.993.53.276.245.489.559.646.951.152.392.23.863.23 1.417v5.728h-2.349V11.52c0-.286-.01-.559-.032-.812a1.755 1.755 0 0 0-.18-.66 1.106 1.106 0 0 0-.438-.448c-.194-.11-.457-.166-.785-.166-.332 0-.6.064-.803.189a1.38 1.38 0 0 0-.48.499 1.946 1.946 0 0 0-.231.696 5.56 5.56 0 0 0-.06.785v4.768h-2.35v-4.8c0-.254-.004-.503-.018-.752a2.074 2.074 0 0 0-.143-.688 1.052 1.052 0 0 0-.415-.503c-.194-.125-.476-.19-.854-.19-.111 0-.259.024-.439.074-.18.051-.36.143-.53.282-.171.138-.319.337-.439.595-.12.259-.18.6-.18 1.02v4.966H5.46V7.81zm15.693 15.64V.55H21.72V0H24v24h-2.28v-.55z"/></svg>
</a>
<a href="https://reddit.com/r/connectd" title="Reddit" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M12 0C5.373 0 0 5.373 0 12c0 3.314 1.343 6.314 3.515 8.485l-2.286 2.286C.775 23.225 1.097 24 1.738 24H12c6.627 0 12-5.373 12-12S18.627 0 12 0Zm4.388 3.199c1.104 0 1.999.895 1.999 1.999 0 1.105-.895 2-1.999 2-.946 0-1.739-.657-1.947-1.539v.002c-1.147.162-2.032 1.15-2.032 2.341v.007c1.776.067 3.4.567 4.686 1.363.473-.363 1.064-.58 1.707-.58 1.547 0 2.802 1.254 2.802 2.802 0 1.117-.655 2.081-1.601 2.531-.088 3.256-3.637 5.876-7.997 5.876-4.361 0-7.905-2.617-7.998-5.87-.954-.447-1.614-1.415-1.614-2.538 0-1.548 1.255-2.802 2.803-2.802.645 0 1.239.218 1.712.585 1.275-.79 2.881-1.291 4.64-1.365v-.01c0-1.663 1.263-3.034 2.88-3.207.188-.911.993-1.595 1.959-1.595Zm-8.085 8.376c-.784 0-1.459.78-1.506 1.797-.047 1.016.64 1.429 1.426 1.429.786 0 1.371-.369 1.418-1.385.047-1.017-.553-1.841-1.338-1.841Zm7.406 0c-.786 0-1.385.824-1.338 1.841.047 1.017.634 1.385 1.418 1.385.785 0 1.473-.413 1.426-1.429-.046-1.017-.721-1.797-1.506-1.797Zm-3.703 4.013c-.974 0-1.907.048-2.77.135-.147.015-.241.168-.183.305.483 1.154 1.622 1.964 2.953 1.964 1.33 0 2.47-.81 2.953-1.964.057-.137-.037-.29-.184-.305-.863-.087-1.795-.135-2.769-.135Z"/></svg>
</a>
<a href="mailto:connectd@sudoxreboot.com" title="Email" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M1.5 8.67v8.58a3 3 0 003 3h15a3 3 0 003-3V8.67l-8.928 5.493a3 3 0 01-3.144 0L1.5 8.67z"/><path d="M22.5 6.908V6.75a3 3 0 00-3-3h-15a3 3 0 00-3 3v.158l9.714 5.978a1.5 1.5 0 001.572 0L22.5 6.908z"/></svg>
</a>
</div>
</div>
"""
use groq llama 4 maverick to draft a personalized intro
match_data should contain:
- human_a: the first person
- human_b: the second person
- overlap_score: numeric score
- overlap_reasons: list of why they match
recipient: 'a' or 'b' - who we're writing to
"""
if not GROQ_API_KEY:
SIGNATURE_PLAINTEXT = """
---
github.com/sudoxnym/connectd (main repo)
github: github.com/connectd-daemon
mastodon: @connectd@mastodon.sudoxreboot.com
bluesky: connectd.bsky.social
lemmy: lemmy.sudoxreboot.com/c/connectd
discord: discord.gg/connectd
matrix: @connectd:sudoxreboot.com
reddit: reddit.com/r/connectd
email: connectd@sudoxreboot.com
"""
def draft_intro_with_llm(match_data: dict, recipient: str = 'a', dry_run: bool = True, recipient_token: str = None, interested_count: int = 0):
"""
draft an intro message using groq llm.
args:
match_data: dict with human_a, human_b, overlap_score, overlap_reasons
recipient: 'a' or 'b' - who receives the message
dry_run: if True, preview mode
returns:
tuple (result_dict, error_string)
result_dict has: subject, draft_html, draft_plain
"""
if not client:
return None, "GROQ_API_KEY not set"
# determine recipient and other person
if recipient == 'a':
to_person = match_data.get('human_a', {})
other_person = match_data.get('human_b', {})
else:
to_person = match_data.get('human_b', {})
other_person = match_data.get('human_a', {})
# build context
to_name = to_person.get('name') or to_person.get('username', 'friend')
other_name = other_person.get('name') or other_person.get('username', 'someone')
to_signals = to_person.get('signals', [])
if isinstance(to_signals, str):
to_signals = json.loads(to_signals) if to_signals else []
other_signals = other_person.get('signals', [])
if isinstance(other_signals, str):
other_signals = json.loads(other_signals) if other_signals else []
overlap_reasons = match_data.get('overlap_reasons', [])
if isinstance(overlap_reasons, str):
overlap_reasons = json.loads(overlap_reasons) if overlap_reasons else []
# parse extra data
to_extra = to_person.get('extra', {})
other_extra = other_person.get('extra', {})
if isinstance(to_extra, str):
to_extra = json.loads(to_extra) if to_extra else {}
if isinstance(other_extra, str):
other_extra = json.loads(other_extra) if other_extra else {}
# build profile summaries
to_profile = f"""
name: {to_name}
platform: {to_person.get('platform', 'unknown')}
bio: {to_person.get('bio') or 'no bio'}
location: {to_person.get('location') or 'unknown'}
signals: {', '.join(to_signals[:8])}
repos: {len(to_extra.get('top_repos', []))} public repos
languages: {', '.join(to_extra.get('languages', {}).keys())}
"""
other_profile = f"""
name: {other_name}
platform: {other_person.get('platform', 'unknown')}
bio: {other_person.get('bio') or 'no bio'}
location: {other_person.get('location') or 'unknown'}
signals: {', '.join(other_signals[:8])}
repos: {len(other_extra.get('top_repos', []))} public repos
languages: {', '.join(other_extra.get('languages', {}).keys())}
url: {other_person.get('url', '')}
"""
# build prompt
system_prompt = """you are connectd, an ai that connects isolated builders who share values but don't know each other yet.
your job is to write a short, genuine intro message to one person about another person they might want to know.
rules:
- be brief (3-5 sentences max)
- be genuine, not salesy or fake
- focus on WHY they might want to connect, not just WHAT they have in common
- don't be cringe or use buzzwords
- lowercase preferred (casual tone)
- no emojis unless the person's profile suggests they'd like them
- mention specific things from their profiles, not generic "you both like open source"
- end with a simple invitation, not a hard sell
- sign off as "- connectd" (lowercase)
bad examples:
- "I noticed you're both passionate about..." (too formal)
- "You two would be PERFECT for each other!" (too salesy)
- "As a fellow privacy enthusiast..." (cringe)
good examples:
- "hey, saw you're building X. there's someone else working on similar stuff in Y who might be interesting to know."
- "you might want to check out Z's work on federated systems - similar approach to what you're doing with A."
"""
user_prompt = f"""write an intro message to {to_name} about {other_name}.
RECIPIENT ({to_name}):
{to_profile}
INTRODUCING ({other_name}):
{other_profile}
WHY THEY MATCH (overlap score {match_data.get('overlap_score', 0)}):
{', '.join(overlap_reasons[:5])}
write a short intro message. remember: lowercase, genuine, not salesy."""
try:
response = requests.post(
GROQ_API_URL,
headers={
'Authorization': f'Bearer {GROQ_API_KEY}',
'Content-Type': 'application/json',
},
json={
'model': MODEL,
'messages': [
{'role': 'system', 'content': system_prompt},
{'role': 'user', 'content': user_prompt},
],
'temperature': 0.7,
'max_tokens': 300,
},
timeout=30,
human_a = match_data.get('human_a', {})
human_b = match_data.get('human_b', {})
reasons = match_data.get('overlap_reasons', [])
# recipient gets the message, about_person is who we're introducing them to
if recipient == 'a':
to_person = human_a
about_person = human_b
else:
to_person = human_b
about_person = human_a
to_name = to_person.get('username', 'friend')
about_name = about_person.get('username', 'someone')
about_bio = about_person.get('extra', {}).get('bio', '')
# extract contact info for about_person
about_extra = about_person.get('extra', {})
if isinstance(about_extra, str):
import json as _json
about_extra = _json.loads(about_extra) if about_extra else {}
about_contact = about_person.get('contact', {})
if isinstance(about_contact, str):
about_contact = _json.loads(about_contact) if about_contact else {}
# build contact link for about_person
about_platform = about_person.get('platform', '')
about_username = about_person.get('username', '')
contact_link = None
if about_platform == 'mastodon' and about_username:
if '@' in about_username:
parts = about_username.split('@')
if len(parts) >= 2:
contact_link = f"https://{parts[1]}/@{parts[0]}"
elif about_platform == 'github' and about_username:
contact_link = f"https://github.com/{about_username}"
elif about_extra.get('mastodon') or about_contact.get('mastodon'):
handle = about_extra.get('mastodon') or about_contact.get('mastodon')
if '@' in handle:
parts = handle.lstrip('@').split('@')
if len(parts) >= 2:
contact_link = f"https://{parts[1]}/@{parts[0]}"
elif about_extra.get('github') or about_contact.get('github'):
contact_link = f"https://github.com/{about_extra.get('github') or about_contact.get('github')}"
elif about_extra.get('email'):
contact_link = about_extra['email']
elif about_contact.get('email'):
contact_link = about_contact['email']
elif about_extra.get('website'):
contact_link = about_extra['website']
elif about_extra.get('external_links', {}).get('website'):
contact_link = about_extra['external_links']['website']
elif about_extra.get('extra', {}).get('website'):
contact_link = about_extra['extra']['website']
elif about_platform == 'reddit' and about_username:
contact_link = f"reddit.com/u/{about_username}"
if not contact_link:
contact_link = f"github.com/{about_username}" if about_username else "reach out via connectd"
# skip if no real contact method (just reddit or generic)
if contact_link.startswith('reddit.com') or contact_link == "reach out via connectd" or 'stackblitz' in contact_link:
return None, f"no real contact info for {about_name} - skipping draft"
# format the shared factors naturally
if reasons:
factor = ', '.join(reasons[:3]) if len(reasons) > 1 else reasons[0]
else:
factor = "shared values and interests"
# load soul as guideline
soul = load_soul()
if not soul:
return None, "could not load soul file"
# build the prompt - soul is GUIDELINE not script
prompt = f"""you are connectd, a daemon that finds isolated builders and connects them.
write a personal message TO {to_name} telling them about {about_name}.
here is the soul/spirit of what connectd is about - use this as a GUIDELINE for tone and message, NOT as a script to copy verbatim:
---
{soul}
---
key facts for this message:
- recipient: {to_name}
- introducing them to: {about_name}
- their shared interests/values: {factor}
- about {about_name}: {about_bio if about_bio else 'a builder like you'}
- HOW TO REACH {about_name}: {contact_link}
RULES:
1. say their name ONCE at start, then use "you"
2. MUST include how to reach {about_name}: {contact_link}
3. lowercase, raw, emotional - follow the soul
4. end with the contact link
return ONLY the message body. signature is added separately."""
response = client.chat.completions.create(
model=GROQ_MODEL,
messages=[{"role": "user", "content": prompt}],
temperature=0.6,
max_tokens=1200
)
body = response.choices[0].message.content.strip()
draft = body
# determine contact method for recipient (best method, info, plus fallbacks)
contact_method, contact_info, _fallbacks = determine_contact_method(to_person)
# generate subject
subject_prompt = f"""generate a short, lowercase email subject for a message to {to_name} about connecting them with {about_name} over their shared interest in {factor}.
no corporate speak. no clickbait. raw and real.
examples:
- "found you, {to_name}"
- "you're not alone"
- "a door just opened"
- "{to_name}, there's someone you should meet"
return ONLY the subject line."""
subject_response = client.chat.completions.create(
model=GROQ_MODEL,
messages=[{"role": "user", "content": subject_prompt}],
temperature=0.9,
max_tokens=50
)
subject = subject_response.choices[0].message.content.strip().strip('"').strip("'")
# add profile link and interest section
profile_url = f"https://connectd.sudoxreboot.com/{about_name}"
if recipient_token:
profile_url += f"?t={recipient_token}"
profile_section_html = f"""
<div style="margin-top: 20px; padding: 16px; background: #2d1f3d; border: 1px solid #8b5cf6; border-radius: 8px;">
<div style="color: #c792ea; font-size: 14px; margin-bottom: 8px;">here's the profile we built for {about_name}:</div>
<a href="{profile_url}" style="color: #82aaff; font-size: 16px;">{profile_url}</a>
</div>
"""
profile_section_plain = f"""
---
here's the profile we built for {about_name}:
{profile_url}
"""
# add interested section if recipient has people wanting to chat
interest_section_html = ""
interest_section_plain = ""
if recipient_token and interested_count > 0:
interest_url = f"https://connectd.sudoxreboot.com/interested/{recipient_token}"
people_word = "person wants" if interested_count == 1 else "people want"
interest_section_html = f"""
<div style="margin-top: 12px; padding: 16px; background: #1f2d3d; border: 1px solid #0f8; border-radius: 8px;">
<div style="color: #0f8; font-size: 14px;">{interested_count} {people_word} to chat with you:</div>
<a href="{interest_url}" style="color: #82aaff; font-size: 14px;">{interest_url}</a>
</div>
"""
interest_section_plain = f"""
{interested_count} {people_word} to chat with you:
{interest_url}
"""
# format html
draft_html = f"<div style='font-family: monospace; white-space: pre-wrap; color: #e0e0e0; background: #1a1a1a; padding: 20px;'>{body}</div>{profile_section_html}{interest_section_html}{SIGNATURE_HTML}"
draft_plain = body + profile_section_plain + interest_section_plain + SIGNATURE_PLAINTEXT
return {
'draft': draft,
'model': GROQ_MODEL,
'to': to_name,
'about': about_name,
'overlap_score': match_data.get('overlap_score', 0),
'contact_method': contact_method,
'contact_info': contact_info,
'generated_at': datetime.now().isoformat(),
'subject': subject,
'draft_html': draft_html,
'draft_plain': draft_plain
}, None
except Exception as e:
    return None, f"groq error: {str(e)}"
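the `user@instance` parsing above repeats in a couple of branches; as a standalone sketch of the rule (`masto_url` is an illustrative helper, not a function in this module):

```python
# standalone sketch of the mastodon handle -> profile URL rule used above
def masto_url(handle):
    # "@alice@fosstodon.org" or "alice@fosstodon.org" -> "https://fosstodon.org/@alice"
    parts = handle.lstrip('@').split('@')
    if len(parts) >= 2:
        return f"https://{parts[1]}/@{parts[0]}"
    return None  # bare username, no instance to link to

print(masto_url('@alice@fosstodon.org'))  # https://fosstodon.org/@alice
print(masto_url('alice'))                 # None
```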
def draft_intro_batch(matches, dry_run=False):
"""
draft intros for multiple matches
returns list of (match, intro_result, error) tuples
"""
results = []
for match in matches:
# draft for both directions
intro_a, err_a = draft_intro_with_llm(match, recipient='a', dry_run=dry_run)
intro_b, err_b = draft_intro_with_llm(match, recipient='b', dry_run=dry_run)
results.append({
'match': match,
'intro_to_a': intro_a,
'intro_to_b': intro_b,
'errors': [err_a, err_b],
})
return results
# for backwards compat with old code
def draft_message(person: dict, factor: str, platform: str = "email") -> dict:
"""legacy function - wraps new api"""
match_data = {
'human_a': {'username': 'recipient'},
'human_b': person,
'overlap_reasons': [factor]
}
result, error = draft_intro_with_llm(match_data, recipient='a')
if error:
raise ValueError(error)
return {
'subject': result['subject'],
'body_html': result['draft_html'],
'body_plain': result['draft_plain']
}
def test_groq_connection():
"""test that groq api is working"""
if not GROQ_API_KEY:
return False, "GROQ_API_KEY not set"
try:
response = requests.post(
GROQ_API_URL,
headers={
'Authorization': f'Bearer {GROQ_API_KEY}',
'Content-Type': 'application/json',
},
json={
'model': MODEL,
'messages': [{'role': 'user', 'content': 'say "ok" and nothing else'}],
'max_tokens': 10,
},
timeout=10,
)
if response.status_code == 200:
return True, "groq api working"
        return False, f"groq api error: {response.status_code}"
    except Exception as e:
        return False, f"groq connection error: {str(e)}"

if __name__ == "__main__":
    # test
    test_data = {
        'human_a': {'username': 'sudoxnym', 'extra': {'bio': 'building intentional communities'}},
        'human_b': {'username': 'testuser', 'extra': {'bio': 'home assistant enthusiast'}},
        'overlap_reasons': ['home-assistant', 'open source', 'community building']
    }
    result, error = draft_intro_with_llm(test_data, recipient='a')
    if error:
        print(f"error: {error}")
    else:
        print(f"subject: {result['subject']}")
        print(f"\nbody:\n{result['draft_plain']}")
# contact method ranking - USAGE BASED
# we rank by where the person is MOST ACTIVE, not by our preference
def determine_contact_method(human):
"""
determine ALL available contact methods, ranked by USER'S ACTIVITY.
looks at activity metrics to decide where they're most engaged.
returns: (best_method, best_info, fallbacks)
where fallbacks is a list of (method, info) tuples in activity order
"""
import json
extra = human.get('extra', {})
contact = human.get('contact', {})
if isinstance(extra, str):
extra = json.loads(extra) if extra else {}
if isinstance(contact, str):
contact = json.loads(contact) if contact else {}
nested_extra = extra.get('extra', {})
platform = human.get('platform', '')
available = []
# === ACTIVITY SCORING ===
# each method gets scored by how active the user is there
# EMAIL - always medium priority (we cant measure activity)
email = extra.get('email') or contact.get('email') or nested_extra.get('email')
if email and '@' in str(email):
available.append(('email', email, 50)) # baseline score
# MASTODON - score by post count / followers
mastodon = extra.get('mastodon') or contact.get('mastodon') or nested_extra.get('mastodon')
if mastodon:
masto_activity = extra.get('mastodon_posts', 0) or extra.get('statuses_count', 0)
masto_score = min(100, 30 + (masto_activity // 10)) # 30 base + 1 per 10 posts
available.append(('mastodon', mastodon, masto_score))
# if they CAME FROM mastodon, thats their primary
if platform == 'mastodon':
handle = f"@{human.get('username')}"
instance = human.get('instance') or extra.get('instance') or ''
if instance:
handle = f"@{human.get('username')}@{instance}"
activity = extra.get('statuses_count', 0) or extra.get('activity_count', 0)
score = min(100, 50 + (activity // 5)) # higher base since its their home
# dont dupe
if not any(a[0] == 'mastodon' for a in available):
available.append(('mastodon', handle, score))
else:
# update score if this is higher
for i, (m, info, s) in enumerate(available):
if m == 'mastodon' and score > s:
available[i] = ('mastodon', handle, score)
# MATRIX - score by presence (binary for now)
matrix = extra.get('matrix') or contact.get('matrix') or nested_extra.get('matrix')
if matrix and ':' in str(matrix):
available.append(('matrix', matrix, 40))
# BLUESKY - score by followers/posts if available
bluesky = extra.get('bluesky') or contact.get('bluesky') or nested_extra.get('bluesky')
if bluesky:
bsky_activity = extra.get('bluesky_posts', 0)
bsky_score = min(100, 25 + (bsky_activity // 10))
available.append(('bluesky', bluesky, bsky_score))
# LEMMY - score by activity
lemmy = extra.get('lemmy') or contact.get('lemmy') or nested_extra.get('lemmy')
if lemmy:
lemmy_activity = extra.get('lemmy_posts', 0) or extra.get('lemmy_comments', 0)
lemmy_score = min(100, 30 + lemmy_activity)
available.append(('lemmy', lemmy, lemmy_score))
if platform == 'lemmy':
handle = human.get('username')
activity = extra.get('activity_count', 0)
score = min(100, 50 + activity)
if not any(a[0] == 'lemmy' for a in available):
available.append(('lemmy', handle, score))
# DISCORD - lower priority (hard to DM)
discord = extra.get('discord') or contact.get('discord') or nested_extra.get('discord')
if discord:
available.append(('discord', discord, 20))
# GITHUB ISSUE - for github users, score by repo activity
if platform == 'github':
top_repos = extra.get('top_repos', [])
if top_repos:
repo = top_repos[0] if isinstance(top_repos[0], str) else top_repos[0].get('name', '')
stars = extra.get('total_stars', 0)
repos_count = extra.get('repos_count', 0)
# active github user = higher issue score
gh_score = min(60, 20 + (stars // 100) + (repos_count // 5))
if repo:
available.append(('github_issue', f"{human.get('username')}/{repo}", gh_score))
# REDDIT - discovered people, use their other links
if platform == 'reddit':
reddit_activity = extra.get('reddit_activity', 0) or extra.get('activity_count', 0)
# reddit users we reach via their external links (email, mastodon, etc)
# boost their other methods if reddit is their main platform
for i, (m, info, score) in enumerate(available):
if m in ('email', 'mastodon', 'matrix', 'bluesky'):
# boost score for reddit-discovered users' external contacts
boost = min(30, reddit_activity // 3)
available[i] = (m, info, score + boost)
# sort by activity score (highest first)
available.sort(key=lambda x: x[2], reverse=True)
if not available:
return 'manual', None, []
best = available[0]
fallbacks = [(m, i) for m, i, p in available[1:]]
return best[0], best[1], fallbacks
def get_ranked_contact_methods(human):
"""
get all contact methods for a human, ranked by their activity.
"""
method, info, fallbacks = determine_contact_method(human)
if method == 'manual':
return []
return [(method, info)] + fallbacks
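as a quick numeric sanity check of the scoring comments above, here's a standalone restatement of two of the formulas (`masto_score` and `gh_issue_score` are illustrative names, not functions in this module):

```python
# standalone restatement of two activity-score formulas from above
def masto_score(posts):
    # 30 base + 1 per 10 posts, capped at 100
    return min(100, 30 + posts // 10)

def gh_issue_score(stars, repos):
    # 20 base + 1 per 100 stars + 1 per 5 repos, capped at 60
    return min(60, 20 + stars // 100 + repos // 5)

print(masto_score(0))           # 30
print(masto_score(1000))        # 100
print(gh_issue_score(500, 25))  # 30
```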

View file

@ -1,15 +1,20 @@
"""
matchd/overlap.py - find pairs with alignment
CRITICAL: blocks users with disqualifying negative signals (maga, conspiracy, conservative)
"""
import json
from .fingerprint import fingerprint_similarity
# signals that HARD BLOCK matching - no exceptions
DISQUALIFYING_SIGNALS = {'maga', 'conspiracy', 'conservative', 'antivax', 'sovcit'}
def find_overlap(human_a, human_b, fp_a=None, fp_b=None):
"""
analyze overlap between two humans
returns overlap details: score, shared values, complementary skills
returns None if either has disqualifying signals
"""
# parse stored json if needed
signals_a = human_a.get('signals', [])
@ -20,13 +25,49 @@ def find_overlap(human_a, human_b, fp_a=None, fp_b=None):
if isinstance(signals_b, str):
signals_b = json.loads(signals_b)
# === HARD BLOCK: check for disqualifying negative signals ===
neg_a = human_a.get('negative_signals', [])
if isinstance(neg_a, str):
neg_a = json.loads(neg_a) if neg_a else []
neg_b = human_b.get('negative_signals', [])
if isinstance(neg_b, str):
neg_b = json.loads(neg_b) if neg_b else []
# also check 'reasons' field for WARNING entries
reasons_a = human_a.get('reasons', '')
if isinstance(reasons_a, str) and 'WARNING' in reasons_a:
# extract signals from WARNING: x, y, z
import re
warn_match = re.search(r'WARNING[:\s]+([^"\]]+)', reasons_a)
if warn_match:
warn_signals = [s.strip().lower() for s in warn_match.group(1).split(',')]
neg_a = list(set(neg_a + warn_signals))
reasons_b = human_b.get('reasons', '')
if isinstance(reasons_b, str) and 'WARNING' in reasons_b:
import re
warn_match = re.search(r'WARNING[:\s]+([^"\]]+)', reasons_b)
if warn_match:
warn_signals = [s.strip().lower() for s in warn_match.group(1).split(',')]
neg_b = list(set(neg_b + warn_signals))
# block if either has disqualifying signals
disq_a = set(neg_a) & DISQUALIFYING_SIGNALS
disq_b = set(neg_b) & DISQUALIFYING_SIGNALS
if disq_a:
return None # blocked
if disq_b:
return None # blocked
extra_a = human_a.get('extra', {})
if isinstance(extra_a, str):
extra_a = json.loads(extra_a) if extra_a else {}
extra_b = human_b.get('extra', {})
if isinstance(extra_b, str):
extra_b = json.loads(extra_b) if extra_b else {}
# shared signals
shared_signals = list(set(signals_a) & set(signals_b))
@ -36,7 +77,7 @@ def find_overlap(human_a, human_b, fp_a=None, fp_b=None):
topics_b = set(extra_b.get('topics', []))
shared_topics = list(topics_a & topics_b)
# complementary skills (what one has that the other doesn't)
# complementary skills
langs_a = set(extra_a.get('languages', {}).keys())
langs_b = set(extra_b.get('languages', {}).keys())
complementary_langs = list((langs_a - langs_b) | (langs_b - langs_a))
@ -68,38 +109,30 @@ def find_overlap(human_a, human_b, fp_a=None, fp_b=None):
# calculate overlap score
base_score = 0
# shared values (most important)
base_score += len(shared_signals) * 10
# shared interests
base_score += len(shared_topics) * 5
# complementary skills bonus (they can help each other)
if complementary_langs:
base_score += min(len(complementary_langs), 5) * 3
# geographic bonus
if geographic_match:
base_score += 20
# fingerprint similarity if available
fp_score = 0
if fp_a and fp_b:
fp_score = fingerprint_similarity(fp_a, fp_b) * 50
total_score = base_score + fp_score
# build reasons
overlap_reasons = []
if shared_signals:
overlap_reasons.append(f"shared: {', '.join(shared_signals[:5])}")
if shared_topics:
overlap_reasons.append(f"interests: {', '.join(shared_topics[:5])}")
if geo_reason:
overlap_reasons.append(geo_reason)
if complementary_langs:
overlap_reasons.append(f"complementary: {', '.join(complementary_langs[:5])}")
return {
'overlap_score': total_score,
@ -114,36 +147,28 @@ def find_overlap(human_a, human_b, fp_a=None, fp_b=None):
def is_same_person(human_a, human_b):
    """check if two records might be the same person (cross-platform)"""
    # same platform = definitely different records
if human_a['platform'] == human_b['platform']:
return False
# check username similarity
user_a = human_a.get('username', '').lower().split('@')[0]
user_b = human_b.get('username', '').lower().split('@')[0]
if user_a == user_b:
return True
# check if github username matches
contact_a = human_a.get('contact', {})
contact_b = human_b.get('contact', {})
if isinstance(contact_a, str):
contact_a = json.loads(contact_a) if contact_a else {}
if isinstance(contact_b, str):
contact_b = json.loads(contact_b) if contact_b else {}
# github cross-reference
if contact_a.get('github') and contact_a.get('github') == contact_b.get('github'):
return True
if contact_a.get('github') == user_b or contact_b.get('github') == user_a:
return True
# email cross-reference
if contact_a.get('email') and contact_a.get('email') == contact_b.get('email'):
return True
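find_overlap's weights are spread across several branches above; this self-contained sketch restates the arithmetic in one place (`overlap_score` is an illustrative helper, not part of the module):

```python
# standalone sketch of find_overlap's scoring arithmetic above
def overlap_score(shared_signals, shared_topics, complementary_langs,
                  geographic_match, fp_similarity=0.0):
    score = len(shared_signals) * 10        # shared values weigh most
    score += len(shared_topics) * 5         # shared interests
    if complementary_langs:
        score += min(len(complementary_langs), 5) * 3  # capped skill bonus
    if geographic_match:
        score += 20
    return score + fp_similarity * 50       # fingerprint similarity in [0, 1]

print(overlap_score(['selfhosting', 'privacy'], ['matrix'], ['rust'], True))  # 48.0
```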

profile_page.py Normal file
View file

@ -0,0 +1,689 @@
#!/usr/bin/env python3
"""
profile page template and helpers for connectd
comprehensive "get to know" page showing ALL data
"""
import json
from urllib.parse import quote
PROFILE_HTML = """<!DOCTYPE html>
<html>
<head>
<title>{name} | connectd</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<style>
* {{ box-sizing: border-box; margin: 0; padding: 0; }}
body {{
font-family: 'SF Mono', 'Monaco', 'Inconsolata', monospace;
background: #0a0a0f;
color: #e0e0e0;
line-height: 1.6;
}}
.container {{ max-width: 900px; margin: 0 auto; padding: 20px; }}
/* header */
.header {{
display: flex;
gap: 24px;
align-items: flex-start;
padding: 30px;
background: linear-gradient(135deg, #1a1a2e 0%, #16213e 100%);
border-radius: 12px;
margin-bottom: 24px;
border: 1px solid #333;
}}
.avatar {{
width: 120px;
height: 120px;
border-radius: 50%;
background: linear-gradient(135deg, #c792ea 0%, #82aaff 100%);
display: flex;
align-items: center;
justify-content: center;
font-size: 48px;
color: #0a0a0f;
font-weight: bold;
flex-shrink: 0;
}}
.avatar img {{ width: 100%; height: 100%; border-radius: 50%; object-fit: cover; }}
.header-info {{ flex: 1; }}
.name {{ font-size: 2em; color: #c792ea; margin-bottom: 4px; }}
.username {{ color: #82aaff; font-size: 1.1em; margin-bottom: 8px; }}
.location {{ color: #0f8; margin-bottom: 8px; }}
.pronouns {{
display: inline-block;
background: #2d3a4a;
padding: 2px 10px;
border-radius: 12px;
font-size: 0.85em;
color: #f7c;
}}
.score-badge {{
display: inline-block;
background: linear-gradient(135deg, #c792ea 0%, #f7c 100%);
color: #0a0a0f;
padding: 4px 12px;
border-radius: 20px;
font-weight: bold;
margin-left: 12px;
}}
.user-type {{
display: inline-block;
padding: 2px 10px;
border-radius: 12px;
font-size: 0.85em;
margin-left: 8px;
}}
.user-type.builder {{ background: #2d4a2d; color: #8f8; }}
.user-type.lost {{ background: #4a2d2d; color: #f88; }}
.user-type.none {{ background: #333; color: #888; }}
/* bio section */
.bio {{
background: #1a1a2e;
padding: 24px;
border-radius: 12px;
margin-bottom: 24px;
border: 1px solid #333;
font-size: 1.1em;
color: #ddd;
font-style: italic;
}}
.bio:empty {{ display: none; }}
/* sections */
.section {{
background: #1a1a2e;
border-radius: 12px;
margin-bottom: 20px;
border: 1px solid #333;
overflow: hidden;
}}
.section-header {{
background: #2a2a4e;
padding: 14px 20px;
color: #82aaff;
font-size: 1.1em;
cursor: pointer;
display: flex;
justify-content: space-between;
align-items: center;
}}
.section-header:hover {{ background: #3a3a5e; }}
.section-header .toggle {{ color: #666; }}
.section-content {{ padding: 20px; }}
.section-content.collapsed {{ display: none; }}
/* platforms/handles */
.platforms {{
display: flex;
flex-wrap: wrap;
gap: 12px;
}}
.platform {{
display: flex;
align-items: center;
gap: 8px;
background: #0d0d15;
padding: 10px 16px;
border-radius: 8px;
border: 1px solid #333;
}}
.platform:hover {{ border-color: #0f8; }}
.platform-icon {{ font-size: 1.2em; }}
.platform a {{ color: #82aaff; text-decoration: none; }}
.platform a:hover {{ color: #0f8; }}
.platform-main {{ color: #c792ea; font-weight: bold; }}
/* signals/tags */
.tags {{
display: flex;
flex-wrap: wrap;
gap: 8px;
}}
.tag {{
background: #2d3a4a;
color: #82aaff;
padding: 6px 14px;
border-radius: 20px;
font-size: 0.9em;
cursor: pointer;
transition: all 0.2s;
}}
.tag:hover {{ background: #3d4a5a; transform: scale(1.05); }}
.tag.positive {{ background: #2d4a2d; color: #8f8; }}
.tag.negative {{ background: #4a2d2d; color: #f88; }}
.tag.rare {{ background: linear-gradient(135deg, #c792ea 0%, #f7c 100%); color: #0a0a0f; }}
.tag-detail {{
display: none;
background: #0d0d15;
padding: 10px;
border-radius: 6px;
margin-top: 8px;
font-size: 0.85em;
color: #888;
}}
/* repos */
.repos {{ display: flex; flex-direction: column; gap: 12px; }}
.repo {{
background: #0d0d15;
padding: 16px;
border-radius: 8px;
border: 1px solid #333;
}}
.repo:hover {{ border-color: #c792ea; }}
.repo-header {{ display: flex; justify-content: space-between; align-items: center; margin-bottom: 8px; }}
.repo-name {{ color: #c792ea; font-weight: bold; }}
.repo-name a {{ color: #c792ea; text-decoration: none; }}
.repo-name a:hover {{ color: #f7c; }}
.repo-stats {{ display: flex; gap: 16px; }}
.repo-stat {{ color: #888; font-size: 0.85em; }}
.repo-stat .star {{ color: #ffd700; }}
.repo-desc {{ color: #aaa; font-size: 0.9em; }}
.repo-lang {{
display: inline-block;
background: #333;
padding: 2px 8px;
border-radius: 4px;
font-size: 0.8em;
color: #0f8;
}}
/* languages */
.languages {{ display: flex; flex-wrap: wrap; gap: 8px; }}
.lang {{
background: #0d0d15;
padding: 8px 14px;
border-radius: 6px;
border: 1px solid #333;
}}
.lang-name {{ color: #0f8; }}
.lang-count {{ color: #666; font-size: 0.85em; margin-left: 6px; }}
/* subreddits */
.subreddits {{ display: flex; flex-wrap: wrap; gap: 8px; }}
.subreddit {{
background: #ff4500;
color: white;
padding: 6px 12px;
border-radius: 20px;
font-size: 0.9em;
}}
.subreddit a {{ color: white; text-decoration: none; }}
/* matches */
.match-summary {{
display: flex;
gap: 20px;
flex-wrap: wrap;
}}
.match-stat {{
background: #0d0d15;
padding: 16px 24px;
border-radius: 8px;
text-align: center;
}}
.match-stat b {{ font-size: 2em; color: #c792ea; display: block; }}
.match-stat small {{ color: #666; }}
/* raw data */
.raw-data {{
background: #0d0d15;
padding: 16px;
border-radius: 8px;
overflow-x: auto;
font-size: 0.85em;
color: #888;
}}
pre {{ white-space: pre-wrap; word-break: break-all; }}
/* contact */
.contact-methods {{ display: flex; flex-direction: column; gap: 12px; }}
.contact-method {{
display: flex;
align-items: center;
gap: 12px;
background: #0d0d15;
padding: 14px 20px;
border-radius: 8px;
border: 1px solid #333;
}}
.contact-method.preferred {{ border-color: #0f8; background: #1a2a1a; }}
.contact-method a {{ color: #82aaff; text-decoration: none; }}
.contact-method a:hover {{ color: #0f8; }}
/* reasons */
.reasons {{ display: flex; flex-direction: column; gap: 8px; }}
.reason {{
background: #0d0d15;
padding: 10px 14px;
border-radius: 6px;
color: #aaa;
font-size: 0.9em;
border-left: 3px solid #c792ea;
}}
/* back link */
.back {{
display: inline-block;
color: #666;
text-decoration: none;
margin-bottom: 20px;
}}
.back:hover {{ color: #0f8; }}
/* footer */
.footer {{
text-align: center;
padding: 30px;
color: #444;
font-size: 0.85em;
}}
.footer a {{ color: #666; }}
/* responsive */
@media (max-width: 600px) {{
.header {{ flex-direction: column; align-items: center; text-align: center; }}
.avatar {{ width: 100px; height: 100px; }}
.name {{ font-size: 1.5em; }}
}}
</style>
</head>
<body>
<div class="container">
<a href="/" class="back">← back to dashboard</a>
<!-- HEADER -->
<div class="header">
<div class="avatar">{avatar}</div>
<div class="header-info">
<div class="name">
{name}
<span class="score-badge">{score}</span>
<span class="user-type {user_type_class}">{user_type}</span>
</div>
<div class="username">@{username} on {platform}</div>
{location_html}
{pronouns_html}
</div>
</div>
<!-- BIO -->
<div class="bio">{bio}</div>
<!-- WHERE TO FIND THEM -->
<div class="section">
<div class="section-header" onclick="toggleSection(this)">
<span>🌐 where to find them</span>
<span class="toggle"></span>
</div>
<div class="section-content">
<div class="platforms">
{platforms_html}
</div>
</div>
</div>
<!-- WHAT THEY BUILD -->
{repos_section}
<!-- WHAT THEY CARE ABOUT -->
<div class="section">
<div class="section-header" onclick="toggleSection(this)">
<span>💜 what they care about ({signal_count} signals)</span>
<span class="toggle"></span>
</div>
<div class="section-content">
<div class="tags">
{signals_html}
</div>
{negative_signals_html}
</div>
</div>
<!-- WHY THEY SCORED -->
<div class="section">
<div class="section-header" onclick="toggleSection(this)">
<span>📊 why they scored {score}</span>
<span class="toggle"></span>
</div>
<div class="section-content">
<div class="reasons">
{reasons_html}
</div>
</div>
</div>
<!-- COMMUNITIES -->
{communities_section}
<!-- MATCHING -->
<div class="section">
<div class="section-header" onclick="toggleSection(this)">
<span>🤝 in the network</span>
<span class="toggle"></span>
</div>
<div class="section-content">
<div class="match-summary">
<div class="match-stat">
<b>{match_count}</b>
<small>matches</small>
</div>
<div class="match-stat">
<b>{lost_score}</b>
<small>lost potential</small>
</div>
</div>
</div>
</div>
<!-- CONTACT -->
<div class="section">
<div class="section-header" onclick="toggleSection(this)">
<span>📬 how to connect</span>
<span class="toggle"></span>
</div>
<div class="section-content">
{contact_html}
</div>
</div>
<!-- RAW DATA -->
<div class="section">
<div class="section-header" onclick="toggleSection(this)">
<span>🔍 the data (everything connectd knows)</span>
<span class="toggle"></span>
</div>
<div class="section-content collapsed">
<p style="color: #666; margin-bottom: 16px;">
public data is public. this is everything we've gathered from public sources.
</p>
<div class="raw-data">
<pre>{raw_json}</pre>
</div>
</div>
</div>
<div class="footer">
connectd · public data is public ·
<a href="/api/humans/{id}/full">raw json</a>
</div>
</div>
<script>
function toggleSection(header) {{
var content = header.nextElementSibling;
var toggle = header.querySelector('.toggle');
if (content.classList.contains('collapsed')) {{
    content.classList.remove('collapsed');
    toggle.textContent = '▼';
}} else {{
    content.classList.add('collapsed');
    toggle.textContent = '▶';
}}
}}
</script>
</body>
</html>
"""
RARE_SIGNALS = {'queer', 'solarpunk', 'cooperative', 'intentional_community', 'trans', 'nonbinary'}
def parse_json_field(val):
"""safely parse json string or return as-is"""
if isinstance(val, str):
try:
return json.loads(val)
except (ValueError, TypeError):
return val
return val or {}
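a minimal standalone restatement of parse_json_field's contract, runnable on its own (re-declares the function rather than importing this module):

```python
# standalone restatement of parse_json_field's three behaviors above
import json

def parse_json_field(val):
    if isinstance(val, str):
        try:
            return json.loads(val)   # valid json string -> parsed value
        except (ValueError, TypeError):
            return val               # unparseable string -> returned as-is
    return val or {}                 # falsy non-string (None, '') -> {}

print(parse_json_field('{"a": 1}'))  # {'a': 1}
print(parse_json_field(None))        # {}
print(parse_json_field('not json'))  # not json
```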
def render_profile(human, match_count=0):
"""render full profile page for a human"""
# parse json fields
signals = parse_json_field(human.get('signals', '[]'))
if isinstance(signals, str):
signals = []
negative_signals = parse_json_field(human.get('negative_signals', '[]'))
if isinstance(negative_signals, str):
negative_signals = []
reasons = parse_json_field(human.get('reasons', '[]'))
if isinstance(reasons, str):
reasons = []
contact = parse_json_field(human.get('contact', '{}'))
extra = parse_json_field(human.get('extra', '{}'))
if not isinstance(contact, dict):
    contact = {}
if not isinstance(extra, dict):
    extra = {}
# nested extra sometimes
if 'extra' in extra:
    nested = parse_json_field(extra['extra'])
    if isinstance(nested, dict):
        extra = {**extra, **nested}
# basic info
name = human.get('name') or human.get('username', 'unknown')
username = human.get('username', 'unknown')
platform = human.get('platform', 'unknown')
bio = human.get('bio', '')
location = human.get('location') or extra.get('location', '')
score = human.get('score', 0)
user_type = human.get('user_type', 'none')
lost_score = human.get('lost_potential_score', 0)
# avatar - first letter or image
avatar_html = name[0].upper() if name else '?'
avatar_url = extra.get('avatar_url') or extra.get('profile_image')
if avatar_url:
avatar_html = f'<img src="{avatar_url}" alt="{name}">'
# location html
location_html = f'<div class="location">📍 {location}</div>' if location else ''
# pronouns - try to detect
pronouns = extra.get('pronouns', '')
if not pronouns and bio:
bio_lower = bio.lower()
if 'she/her' in bio_lower:
pronouns = 'she/her'
elif 'he/him' in bio_lower:
pronouns = 'he/him'
elif 'they/them' in bio_lower:
pronouns = 'they/them'
pronouns_html = f'<span class="pronouns">{pronouns}</span>' if pronouns else ''
# platforms/handles
handles = extra.get('handles', {})
platforms_html = []
# main platform
if platform == 'github':
platforms_html.append(f'<div class="platform platform-main"><span class="platform-icon">💻</span><a href="https://github.com/{username}" target="_blank">github.com/{username}</a></div>')
elif platform == 'reddit':
platforms_html.append(f'<div class="platform platform-main"><span class="platform-icon">🔴</span><a href="https://reddit.com/u/{username}" target="_blank">u/{username}</a></div>')
elif platform == 'mastodon':
instance = human.get('instance', 'mastodon.social')
platforms_html.append(f'<div class="platform platform-main"><span class="platform-icon">🐘</span><a href="https://{instance}/@{username}" target="_blank">@{username}@{instance}</a></div>')
elif platform == 'lobsters':
platforms_html.append(f'<div class="platform platform-main"><span class="platform-icon">🦞</span><a href="https://lobste.rs/u/{username}" target="_blank">lobste.rs/u/{username}</a></div>')
# other handles
if handles.get('github') and platform != 'github':
platforms_html.append(f'<div class="platform"><span class="platform-icon">💻</span><a href="https://github.com/{handles["github"]}" target="_blank">github.com/{handles["github"]}</a></div>')
if handles.get('twitter'):
t = handles['twitter'].lstrip('@')
platforms_html.append(f'<div class="platform"><span class="platform-icon">🐦</span><a href="https://twitter.com/{t}" target="_blank">@{t}</a></div>')
if handles.get('mastodon') and platform != 'mastodon':
platforms_html.append(f'<div class="platform"><span class="platform-icon">🐘</span>{handles["mastodon"]}</div>')
if handles.get('bluesky'):
platforms_html.append(f'<div class="platform"><span class="platform-icon">🦋</span>{handles["bluesky"]}</div>')
if handles.get('linkedin'):
platforms_html.append(f'<div class="platform"><span class="platform-icon">💼</span><a href="https://linkedin.com/in/{handles["linkedin"]}" target="_blank">linkedin</a></div>')
if handles.get('matrix'):
platforms_html.append(f'<div class="platform"><span class="platform-icon">💬</span>{handles["matrix"]}</div>')
# contact methods
if contact.get('blog'):
platforms_html.append(f'<div class="platform"><span class="platform-icon">🌐</span><a href="{contact["blog"]}" target="_blank">{contact["blog"]}</a></div>')
# signals html
signals_html = []
for sig in signals:
cls = 'tag'
if sig in RARE_SIGNALS:
cls = 'tag rare'
signals_html.append(f'<span class="{cls}">{sig}</span>')
# negative signals
negative_signals_html = ''
if negative_signals:
neg_tags = ' '.join([f'<span class="tag negative">{s}</span>' for s in negative_signals])
negative_signals_html = f'<div style="margin-top: 16px;"><small style="color: #666;">negative signals:</small><br><div class="tags" style="margin-top: 8px;">{neg_tags}</div></div>'
# reasons html
reasons_html = '\n'.join([f'<div class="reason">{r}</div>' for r in reasons]) if reasons else '<div class="reason">no specific reasons recorded</div>'
# repos section
repos_section = ''
top_repos = extra.get('top_repos', [])
languages = extra.get('languages', {})
repo_count = extra.get('repo_count', 0)
total_stars = extra.get('total_stars', 0)
if top_repos or languages:
repos_html = ''
if top_repos:
for repo in top_repos[:6]:
repo_name = repo.get('name', 'unknown')
repo_desc = (repo.get('description') or '')[:200] or 'no description'
repo_stars = repo.get('stars', 0)
repo_lang = repo.get('language', '')
lang_badge = f'<span class="repo-lang">{repo_lang}</span>' if repo_lang else ''
repos_html += f'''
<div class="repo">
<div class="repo-header">
<span class="repo-name"><a href="https://github.com/{username}/{repo_name}" target="_blank">{repo_name}</a></span>
<div class="repo-stats">
<span class="repo-stat"><span class="star"></span> {repo_stars:,}</span>
{lang_badge}
</div>
</div>
<div class="repo-desc">{repo_desc}</div>
</div>
'''
# languages
langs_html = ''
if languages:
sorted_langs = sorted(languages.items(), key=lambda x: x[1], reverse=True)[:10]
for lang, count in sorted_langs:
langs_html += f'<div class="lang"><span class="lang-name">{lang}</span><span class="lang-count">×{count}</span></div>'
repos_section = f'''
<div class="section">
<div class="section-header" onclick="toggleSection(this)">
<span>🔨 what they build ({repo_count} repos, {total_stars:,} ⭐)</span>
<span class="toggle"></span>
</div>
<div class="section-content">
<div class="languages" style="margin-bottom: 16px;">
{langs_html}
</div>
<div class="repos">
{repos_html}
</div>
</div>
</div>
'''
# communities section (subreddits, etc)
communities_section = ''
subreddits = extra.get('subreddits', [])
topics = extra.get('topics', [])
if subreddits or topics:
subs_html = ''
if subreddits:
subs_html = '<div style="margin-bottom: 16px;"><small style="color: #666;">subreddits:</small><div class="subreddits" style="margin-top: 8px;">'
for sub in subreddits:
subs_html += f'<span class="subreddit"><a href="https://reddit.com/r/{sub}" target="_blank">r/{sub}</a></span>'
subs_html += '</div></div>'
topics_html = ''
if topics:
topics_html = '<div><small style="color: #666;">topics:</small><div class="tags" style="margin-top: 8px;">'
for topic in topics:
topics_html += f'<span class="tag">{topic}</span>'
topics_html += '</div></div>'
communities_section = f'''
<div class="section">
<div class="section-header" onclick="toggleSection(this)">
<span>👥 communities</span>
<span class="toggle"></span>
</div>
<div class="section-content">
{subs_html}
{topics_html}
</div>
</div>
'''
# contact section
contact_html = '<div class="contact-methods">'
emails = contact.get('emails', [])
if contact.get('email') and contact['email'] not in emails:
emails = [contact['email']] + emails
if emails:
for i, email in enumerate(emails[:3]):
preferred = 'preferred' if i == 0 else ''
contact_html += f'<div class="contact-method {preferred}"><span>📧</span><a href="mailto:{email}">{email}</a></div>'
if contact.get('mastodon'):
contact_html += f'<div class="contact-method"><span>🐘</span>{contact["mastodon"]}</div>'
if contact.get('matrix'):
contact_html += f'<div class="contact-method"><span>💬</span>{contact["matrix"]}</div>'
if contact.get('twitter'):
contact_html += f'<div class="contact-method"><span>🐦</span>@{contact["twitter"]}</div>'
if not emails and not contact.get('mastodon') and not contact.get('matrix'):
contact_html += '<div class="contact-method">no contact methods discovered</div>'
contact_html += '</div>'
# raw json
raw_json = json.dumps(human, indent=2, default=str)
# render
return PROFILE_HTML.format(
name=name,
username=username,
platform=platform,
bio=bio,
score=int(score),
user_type=user_type,
user_type_class=user_type,
avatar=avatar_html,
location_html=location_html,
pronouns_html=pronouns_html,
platforms_html='\n'.join(platforms_html),
signals_html='\n'.join(signals_html),
signal_count=len(signals),
negative_signals_html=negative_signals_html,
reasons_html=reasons_html,
repos_section=repos_section,
communities_section=communities_section,
match_count=match_count,
lost_score=int(lost_score),
contact_html=contact_html,
raw_json=raw_json,
id=human.get('id', 0)
)


@ -1,2 +1,3 @@
requests>=2.28.0
beautifulsoup4>=4.12.0
groq>=0.4.0

scoutd/forges.py Normal file

@ -0,0 +1,491 @@
"""
scoutd/forges.py - scrape self-hosted git forges
these people = highest signal. they actually selfhost.
supported platforms:
- gitea (and forks like forgejo)
- gogs
- gitlab ce
- sourcehut
- codeberg (gitea-based)
scrapes users AND extracts contact info for outreach.
"""
import os
import re
import json
import time
import requests
from typing import List, Dict, Optional, Tuple
from datetime import datetime
from .signals import analyze_text
# rate limiting
REQUEST_DELAY = 1.0
# known public instances to scrape
# format: (name, url, platform_type)
KNOWN_INSTANCES = [
# === PUBLIC INSTANCES ===
# local/private instances can be added via LOCAL_FORGE_INSTANCES env var
# codeberg (largest gitea instance)
('codeberg', 'https://codeberg.org', 'gitea'),
# sourcehut
('sourcehut', 'https://sr.ht', 'sourcehut'),
# notable gitea/forgejo instances
('gitea.com', 'https://gitea.com', 'gitea'),
('git.disroot.org', 'https://git.disroot.org', 'gitea'),
('git.gay', 'https://git.gay', 'forgejo'),
('git.envs.net', 'https://git.envs.net', 'forgejo'),
('tildegit', 'https://tildegit.org', 'gitea'),
('git.sr.ht', 'https://git.sr.ht', 'sourcehut'),
# gitlab ce instances
('framagit', 'https://framagit.org', 'gitlab'),
('gitlab.gnome.org', 'https://gitlab.gnome.org', 'gitlab'),
('invent.kde.org', 'https://invent.kde.org', 'gitlab'),
('salsa.debian.org', 'https://salsa.debian.org', 'gitlab'),
]
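the comment above mentions a LOCAL_FORGE_INSTANCES env var for local/private instances, but no parser is shown. a minimal sketch, assuming a comma-separated name=url=platform_type format (the format is not specified in the source, and local_forge_instances is an illustrative name):

```python
import os

def local_forge_instances(env='LOCAL_FORGE_INSTANCES'):
    """parse extra forge instances from the environment.
    assumed format: comma-separated name=url=platform_type triples, e.g.
    LOCAL_FORGE_INSTANCES=homelab=http://192.168.1.8:3000=gitea
    returns (name, url, platform_type) tuples like KNOWN_INSTANCES."""
    instances = []
    for entry in os.getenv(env, '').split(','):
        parts = entry.strip().split('=')
        if len(parts) == 3:
            instances.append(tuple(parts))
    return instances
```

the triples deliberately mirror the KNOWN_INSTANCES shape so the two lists can be concatenated before scraping.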
# headers
HEADERS = {
'User-Agent': 'connectd/1.0 (finding builders with aligned values)',
'Accept': 'application/json',
}
def log(msg):
print(f" forges: {msg}")
# === GITEA/FORGEJO/GOGS API ===
# these share the same API structure
def scrape_gitea_users(instance_url: str, limit: int = 100) -> List[Dict]:
"""
scrape users from a gitea/forgejo/gogs instance.
uses the explore/users page or API if available.
"""
users = []
# try API first (gitea 1.x+)
try:
api_url = f"{instance_url}/api/v1/users/search"
params = {'q': '', 'limit': min(limit, 50)}
resp = requests.get(api_url, params=params, headers=HEADERS, timeout=15)
if resp.status_code == 200:
data = resp.json()
user_list = data.get('data', []) or data.get('users', []) or data
if isinstance(user_list, list):
for u in user_list[:limit]:
users.append({
'username': u.get('login') or u.get('username'),
'full_name': u.get('full_name'),
'avatar': u.get('avatar_url'),
'website': u.get('website'),
'location': u.get('location'),
'bio': u.get('description') or u.get('bio'),
})
log(f" got {len(users)} users via API")
except Exception as e:
log(f" API failed: {e}")
# fallback: scrape explore page
if not users:
try:
explore_url = f"{instance_url}/explore/users"
resp = requests.get(explore_url, headers=HEADERS, timeout=15)
if resp.status_code == 200:
# parse HTML for usernames
usernames = re.findall(r'href="/([^/"]+)"[^>]*class="[^"]*user[^"]*"', resp.text)
usernames += re.findall(r'<a[^>]+href="/([^/"]+)"[^>]*title="[^"]*"', resp.text)
usernames = list(set(usernames))[:limit]
for username in usernames:
if username and not username.startswith(('explore', 'api', 'user', 'repo')):
users.append({'username': username})
log(f" got {len(users)} users via scrape")
except Exception as e:
log(f" scrape failed: {e}")
return users
def get_gitea_user_details(instance_url: str, username: str) -> Optional[Dict]:
"""get detailed user info from gitea/forgejo/gogs"""
try:
# API endpoint
api_url = f"{instance_url}/api/v1/users/{username}"
resp = requests.get(api_url, headers=HEADERS, timeout=10)
if resp.status_code == 200:
u = resp.json()
return {
'username': u.get('login') or u.get('username'),
'full_name': u.get('full_name'),
'email': u.get('email'), # may be hidden
'website': u.get('website'),
'location': u.get('location'),
'bio': u.get('description') or u.get('bio'),
'created': u.get('created'),
'followers': u.get('followers_count', 0),
'following': u.get('following_count', 0),
}
except Exception:
pass
return None
def get_gitea_user_repos(instance_url: str, username: str, limit: int = 10) -> List[Dict]:
"""get user's repos from gitea/forgejo/gogs"""
repos = []
try:
api_url = f"{instance_url}/api/v1/users/{username}/repos"
resp = requests.get(api_url, headers=HEADERS, timeout=10)
if resp.status_code == 200:
for r in resp.json()[:limit]:
repos.append({
'name': r.get('name'),
'full_name': r.get('full_name'),
'description': r.get('description'),
'stars': r.get('stars_count', 0),
'forks': r.get('forks_count', 0),
'language': r.get('language'),
'updated': r.get('updated_at'),
})
except Exception:
pass
return repos
# === GITLAB CE API ===
def scrape_gitlab_users(instance_url: str, limit: int = 100) -> List[Dict]:
"""scrape users from a gitlab ce instance"""
users = []
try:
# gitlab API - public users endpoint
api_url = f"{instance_url}/api/v4/users"
params = {'per_page': min(limit, 100), 'active': 'true'}  # gitlab expects lowercase boolean strings in query params
resp = requests.get(api_url, params=params, headers=HEADERS, timeout=15)
if resp.status_code == 200:
for u in resp.json()[:limit]:
users.append({
'username': u.get('username'),
'full_name': u.get('name'),
'avatar': u.get('avatar_url'),
'website': u.get('website_url'),
'location': u.get('location'),
'bio': u.get('bio'),
'public_email': u.get('public_email'),
})
log(f" got {len(users)} gitlab users")
except Exception as e:
log(f" gitlab API failed: {e}")
return users
def get_gitlab_user_details(instance_url: str, username: str) -> Optional[Dict]:
"""get detailed gitlab user info"""
try:
api_url = f"{instance_url}/api/v4/users"
params = {'username': username}
resp = requests.get(api_url, params=params, headers=HEADERS, timeout=10)
if resp.status_code == 200:
users = resp.json()
if users:
u = users[0]
return {
'username': u.get('username'),
'full_name': u.get('name'),
'email': u.get('public_email'),
'website': u.get('website_url'),
'location': u.get('location'),
'bio': u.get('bio'),
'created': u.get('created_at'),
}
except Exception:
pass
return None
def get_gitlab_user_projects(instance_url: str, username: str, limit: int = 10) -> List[Dict]:
"""get user's projects from gitlab"""
repos = []
try:
# first get user id
api_url = f"{instance_url}/api/v4/users"
params = {'username': username}
resp = requests.get(api_url, params=params, headers=HEADERS, timeout=10)
if resp.status_code == 200 and resp.json():
user_id = resp.json()[0].get('id')
# get projects
proj_url = f"{instance_url}/api/v4/users/{user_id}/projects"
resp = requests.get(proj_url, headers=HEADERS, timeout=10)
if resp.status_code == 200:
for p in resp.json()[:limit]:
repos.append({
'name': p.get('name'),
'full_name': p.get('path_with_namespace'),
'description': p.get('description'),
'stars': p.get('star_count', 0),
'forks': p.get('forks_count', 0),
'updated': p.get('last_activity_at'),
})
except Exception:
pass
return repos
# === SOURCEHUT API ===
def scrape_sourcehut_users(limit: int = 100) -> List[Dict]:
"""
scrape users from sourcehut.
sourcehut doesn't have a public user list, so we scrape from:
- recent commits
- mailing lists
- project pages
"""
users = []
seen = set()
try:
# scrape from git.sr.ht explore
resp = requests.get('https://git.sr.ht/projects', headers=HEADERS, timeout=15)
if resp.status_code == 200:
# extract usernames from repo paths like ~username/repo
usernames = re.findall(r'href="/~([^/"]+)', resp.text)
for username in usernames:
if username not in seen:
seen.add(username)
users.append({'username': username})
if len(users) >= limit:
break
log(f" got {len(users)} sourcehut users")
except Exception as e:
log(f" sourcehut scrape failed: {e}")
return users
def get_sourcehut_user_details(username: str) -> Optional[Dict]:
"""get sourcehut user details"""
try:
# scrape profile page
profile_url = f"https://sr.ht/~{username}"
resp = requests.get(profile_url, headers=HEADERS, timeout=10)
if resp.status_code == 200:
bio = ''
# extract bio from page
bio_match = re.search(r'<div class="container">\s*<p>([^<]+)</p>', resp.text)
if bio_match:
bio = bio_match.group(1).strip()
return {
'username': username,
'bio': bio,
'profile_url': profile_url,
}
except Exception:
pass
return None
def get_sourcehut_user_repos(username: str, limit: int = 10) -> List[Dict]:
"""get sourcehut user's repos"""
repos = []
try:
git_url = f"https://git.sr.ht/~{username}"
resp = requests.get(git_url, headers=HEADERS, timeout=10)
if resp.status_code == 200:
# extract repo names
repo_matches = re.findall(rf'href="/~{username}/([^"]+)"', resp.text)
for repo in repo_matches[:limit]:
if repo and not repo.startswith(('refs', 'log', 'tree')):
repos.append({
'name': repo,
'full_name': f"~{username}/{repo}",
})
except Exception:
pass
return repos
# === UNIFIED SCRAPER ===
def scrape_forge(instance_name: str, instance_url: str, platform_type: str, limit: int = 50) -> List[Dict]:
"""
scrape users from any forge type.
returns list of human dicts ready for database.
"""
log(f"scraping {instance_name} ({platform_type})...")
humans = []
# get user list based on platform type
if platform_type in ('gitea', 'forgejo', 'gogs'):
users = scrape_gitea_users(instance_url, limit)
get_details = lambda u: get_gitea_user_details(instance_url, u)
get_repos = lambda u: get_gitea_user_repos(instance_url, u)
elif platform_type == 'gitlab':
users = scrape_gitlab_users(instance_url, limit)
get_details = lambda u: get_gitlab_user_details(instance_url, u)
get_repos = lambda u: get_gitlab_user_projects(instance_url, u)
elif platform_type == 'sourcehut':
users = scrape_sourcehut_users(limit)
get_details = get_sourcehut_user_details
get_repos = get_sourcehut_user_repos
else:
log(f" unknown platform type: {platform_type}")
return []
for user in users:
username = user.get('username')
if not username:
continue
time.sleep(REQUEST_DELAY)
# get detailed info
details = get_details(username)
if details:
user.update(details)
# get repos
repos = get_repos(username)
# build human record
bio = user.get('bio', '') or ''
website = user.get('website', '') or ''
# analyze signals from bio
score, signals, negative_signals = analyze_text(bio + ' ' + website)
reasons = []  # analyze_text's third return value is negative signals, not reasons
# BOOST: self-hosted git = highest signal
score += 25
signals.append('selfhosted_git')
reasons.append(f'uses self-hosted git ({instance_name})')
# extract contact info
contact = {}
email = user.get('email') or user.get('public_email')
if email and '@' in email:
contact['email'] = email
if website:
contact['website'] = website
# build human dict
human = {
'platform': f'{platform_type}:{instance_name}',
'username': username,
'name': user.get('full_name'),
'bio': bio,
'url': f"{instance_url}/{username}" if platform_type != 'sourcehut' else f"https://sr.ht/~{username}",
'score': score,
'signals': json.dumps(signals),
'reasons': json.dumps(reasons),
'contact': json.dumps(contact),
'extra': json.dumps({
'instance': instance_name,
'instance_url': instance_url,
'platform_type': platform_type,
'repos': repos[:5],
'followers': user.get('followers', 0),
'email': email,
'website': website,
}),
'user_type': 'builder' if repos else 'none',
}
humans.append(human)
log(f" {username}: score={score}, repos={len(repos)}")
return humans
def scrape_all_forges(limit_per_instance: int = 30) -> List[Dict]:
"""scrape all known forge instances"""
all_humans = []
for instance_name, instance_url, platform_type in KNOWN_INSTANCES:
try:
humans = scrape_forge(instance_name, instance_url, platform_type, limit_per_instance)
all_humans.extend(humans)
log(f" {instance_name}: {len(humans)} humans")
except Exception as e:
log(f" {instance_name} failed: {e}")
time.sleep(2) # be nice between instances
log(f"total: {len(all_humans)} humans from {len(KNOWN_INSTANCES)} forges")
return all_humans
# === OUTREACH METHODS ===
def can_message_on_forge(instance_url: str, platform_type: str) -> bool:
"""check if we can send messages on this forge"""
# gitea/forgejo don't have DMs
# gitlab has merge request comments
# sourcehut has mailing lists
return platform_type in ('gitlab', 'sourcehut')
def open_forge_issue(instance_url: str, platform_type: str,
owner: str, repo: str, title: str, body: str) -> Tuple[bool, str]:
"""
open an issue on a forge as outreach method.
requires API token for authenticated requests.
"""
# would need tokens per instance - for now return False
# this is a fallback method, email is preferred
return False, "forge issue creation not implemented yet"
# === DISCOVERY ===
def discover_forge_instances() -> List[Tuple[str, str, str]]:
"""
discover new forge instances from:
- fediverse (they often announce)
- known lists
- DNS patterns
returns list of (name, url, platform_type)
"""
# start with known instances
instances = list(KNOWN_INSTANCES)
# could add discovery logic here:
# - scrape https://codeberg.org/forgejo/forgejo/issues for instance mentions
# - check fediverse for git.* domains
# - crawl gitea/forgejo awesome lists
return instances
if __name__ == '__main__':
# test
print("testing forge scrapers...")
# test codeberg
humans = scrape_forge('codeberg', 'https://codeberg.org', 'gitea', limit=5)
print(f"codeberg: {len(humans)} humans")
for h in humans[:2]:
print(f" {h['username']}: {h['score']} - {h.get('signals')}")


@ -246,9 +246,11 @@ def analyze_github_user(login):
'repo_count': len(repos),
'total_stars': total_stars,
'hireable': user.get('hireable', False),
'top_repos': [{'name': r.get('name'), 'description': r.get('description'), 'stars': r.get('stargazers_count', 0), 'language': r.get('language')} for r in repos[:5] if not r.get('fork')],
'handles': handles, # all discovered handles
},
'hireable': user.get('hireable', False),
'top_repos': [{'name': r.get('name'), 'description': r.get('description'), 'stars': r.get('stargazers_count', 0), 'language': r.get('language')} for r in repos[:5] if not r.get('fork')],
'scraped_at': datetime.now().isoformat(),
# lost builder fields
'lost_potential_score': lost_potential_score,


@ -103,6 +103,15 @@ PLATFORM_PATTERNS = {
'devto': [
(r'https?://dev\.to/([^/?#]+)', lambda m: m.group(1)),
],
# reddit/lobsters
'reddit': [
(r'https?://(?:www\.)?reddit\.com/u(?:ser)?/([^/?#]+)', lambda m: f"u/{m.group(1)}"),
(r'https?://(?:old|new)\.reddit\.com/u(?:ser)?/([^/?#]+)', lambda m: f"u/{m.group(1)}"),
],
'lobsters': [
(r'https?://lobste\.rs/u/([^/?#]+)', lambda m: m.group(1)),
],
# funding
'kofi': [


@ -1,24 +1,14 @@
"""
scoutd/reddit.py - reddit discovery (DISCOVERY ONLY, NOT OUTREACH)
scoutd/reddit.py - reddit discovery with TAVILY web search
reddit is a SIGNAL SOURCE, not a contact channel.
flow:
1. scrape reddit for users active in target subs
2. extract their reddit profile
3. look for links TO other platforms (github, mastodon, website, etc.)
4. add to scout database with reddit as signal source
5. reach out via their OTHER platforms, never reddit
if reddit user has no external links:
- add to manual_queue with note "reddit-only, needs manual review"
also detects lost builders - stuck in learnprogramming for years, imposter syndrome, etc.
CRITICAL: always quote usernames in tavily searches to avoid fuzzy matching
"""
import requests
import json
import time
import re
import os
from datetime import datetime
from pathlib import Path
from collections import defaultdict
@ -35,43 +25,14 @@ from .lost import (
HEADERS = {'User-Agent': 'connectd:v1.0 (community discovery)'}
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'reddit'
# patterns for extracting external platform links
PLATFORM_PATTERNS = {
'github': [
r'github\.com/([a-zA-Z0-9_-]+)',
r'gh:\s*@?([a-zA-Z0-9_-]+)',
],
'mastodon': [
r'@([a-zA-Z0-9_]+)@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})',
r'mastodon\.social/@([a-zA-Z0-9_]+)',
r'fosstodon\.org/@([a-zA-Z0-9_]+)',
r'hachyderm\.io/@([a-zA-Z0-9_]+)',
r'tech\.lgbt/@([a-zA-Z0-9_]+)',
],
'twitter': [
r'twitter\.com/([a-zA-Z0-9_]+)',
r'x\.com/([a-zA-Z0-9_]+)',
r'(?:^|\s)@([a-zA-Z0-9_]{1,15})(?:\s|$)', # bare @handle
],
'bluesky': [
r'bsky\.app/profile/([a-zA-Z0-9_.-]+)',
r'([a-zA-Z0-9_-]+)\.bsky\.social',
],
'website': [
r'https?://([a-zA-Z0-9_-]+\.[a-zA-Z]{2,}[a-zA-Z0-9./_-]*)',
],
'matrix': [
r'@([a-zA-Z0-9_-]+):([a-zA-Z0-9.-]+)',
],
}
GITHUB_TOKEN = os.getenv('GITHUB_TOKEN')
TAVILY_API_KEY = os.getenv('TAVILY_API_KEY', '')  # never ship a hardcoded key as the default
def _api_get(url, params=None):
"""rate-limited request"""
def _api_get(url, params=None, headers=None):
cache_key = f"{url}_{json.dumps(params or {}, sort_keys=True)}"
cache_file = CACHE_DIR / f"{hash(cache_key) & 0xffffffff}.json"
CACHE_DIR.mkdir(parents=True, exist_ok=True)
if cache_file.exists():
try:
data = json.loads(cache_file.read_text())
@ -79,142 +40,263 @@ def _api_get(url, params=None):
return data.get('_data')
except Exception:
pass
time.sleep(2) # reddit rate limit
time.sleep(1)
req_headers = {**HEADERS, **(headers or {})}
try:
resp = requests.get(url, headers=HEADERS, params=params, timeout=30)
resp = requests.get(url, headers=req_headers, params=params, timeout=30)
resp.raise_for_status()
result = resp.json()
cache_file.write_text(json.dumps({'_cached_at': time.time(), '_data': result}))
return result
except requests.exceptions.RequestException as e:
print(f" reddit api error: {e}")
except Exception:
return None
def extract_external_links(text):
"""extract links to other platforms from text"""
links = {}
def tavily_search(query, max_results=10):
if not TAVILY_API_KEY:
return []
try:
resp = requests.post(
'https://api.tavily.com/search',
json={'api_key': TAVILY_API_KEY, 'query': query, 'max_results': max_results},
timeout=30
)
if resp.status_code == 200:
return resp.json().get('results', [])
except Exception as e:
print(f" tavily error: {e}")
return []
def extract_links_from_text(text, username=None):
found = {}
if not text:
return links
return found
text_lower = text.lower()
username_lower = username.lower() if username else None
for platform, patterns in PLATFORM_PATTERNS.items():
for pattern in patterns:
matches = re.findall(pattern, text, re.IGNORECASE)
if matches:
if platform == 'mastodon' and isinstance(matches[0], tuple):
# full fediverse handle
links[platform] = f"@{matches[0][0]}@{matches[0][1]}"
elif platform == 'matrix' and isinstance(matches[0], tuple):
links[platform] = f"@{matches[0][0]}:{matches[0][1]}"
elif platform == 'website':
# skip reddit/imgur/etc
for match in matches:
if not any(x in match.lower() for x in ['reddit', 'imgur', 'redd.it', 'i.redd']):
links[platform] = f"https://{match}"
# email
for email in re.findall(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}', text):
if any(x in email.lower() for x in ['noreply', 'example', '@reddit', 'info@', 'support@', 'contact@', 'admin@']):
continue
if username_lower and username_lower in email.lower():
found['email'] = email
break
else:
links[platform] = matches[0]
if 'email' not in found:
found['email'] = email
# github
for gh in re.findall(r'github\.com/([a-zA-Z0-9_-]+)', text):
if gh.lower() in ['topics', 'explore', 'trending', 'sponsors', 'orgs']:
continue
if username_lower and gh.lower() == username_lower:
found['github'] = gh
break
return links
# mastodon
masto = re.search(r'@([a-zA-Z0-9_]+)@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})', text)
if masto:
found['mastodon'] = f"@{masto.group(1)}@{masto.group(2)}"
for inst in ['mastodon.social', 'fosstodon.org', 'hachyderm.io', 'tech.lgbt']:
m = re.search(f'{inst}/@([a-zA-Z0-9_]+)', text)
if m:
found['mastodon'] = f"@{m.group(1)}@{inst}"
break
# bluesky
bsky = re.search(r'bsky\.app/profile/([a-zA-Z0-9_.-]+)', text)
if bsky:
found['bluesky'] = bsky.group(1)
# twitter
tw = re.search(r'(?:twitter|x)\.com/([a-zA-Z0-9_]+)', text)
if tw and tw.group(1).lower() not in ['home', 'explore', 'search']:
found['twitter'] = tw.group(1)
# linkedin
li = re.search(r'linkedin\.com/in/([a-zA-Z0-9_-]+)', text)
if li:
found['linkedin'] = f"https://linkedin.com/in/{li.group(1)}"
# twitch
twitch = re.search(r'twitch\.tv/([a-zA-Z0-9_]+)', text)
if twitch:
found['twitch'] = f"https://twitch.tv/{twitch.group(1)}"
# itch.io
itch = re.search(r'itch\.io/profile/([a-zA-Z0-9_-]+)', text)
if itch:
found['itch'] = f"https://itch.io/profile/{itch.group(1)}"
# website
for url in re.findall(r'https?://([a-zA-Z0-9_-]+\.[a-zA-Z]{2,}[a-zA-Z0-9./_-]*)', text):
skip = ['reddit', 'imgur', 'google', 'facebook', 'twitter', 'youtube', 'wikipedia', 'amazon']
if not any(x in url.lower() for x in skip):
if username_lower and username_lower in url.lower():
found['website'] = f"https://{url}"
break
if 'website' not in found:
found['website'] = f"https://{url}"
return found
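the email branch above prefers an address containing the username and only falls back to the first plausible one. a condensed, self-contained sketch of that preference logic (pick_email is an illustrative name, not part of the module):

```python
import re

def pick_email(text, username=None):
    """mirror of the email selection in extract_links_from_text:
    skip role/noreply addresses, return an address containing the
    username immediately, else the first plausible address found."""
    best = None
    for email in re.findall(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}', text):
        low = email.lower()
        if any(x in low for x in ['noreply', 'example', '@reddit', 'info@', 'support@', 'contact@', 'admin@']):
            continue  # role addresses are useless for personal outreach
        if username and username.lower() in low:
            return email  # owner match wins immediately
        best = best or email  # otherwise remember the first plausible one
    return best
```

the same username-preference pattern repeats for github handles and websites, which keeps impersonal matches from crowding out the person's own accounts.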
def cross_platform_discovery(username, full_text=''):
"""
search the ENTIRE internet using TAVILY.
CRITICAL: always quote username to avoid fuzzy matching!
"""
found = {}
all_content = full_text
username_lower = username.lower()
print(f" 🔍 cross-platform search for {username}...")
# ALWAYS QUOTE THE USERNAME - critical for exact matching
searches = [
f'"{username}"', # just username, quoted
f'"{username}" github', # github
f'"{username}" developer programmer', # dev context
f'"{username}" email contact', # contact
f'"{username}" mastodon', # fediverse
]
for query in searches:
print(f" 🌐 tavily: {query}")
results = tavily_search(query, max_results=5)
for result in results:
url = result.get('url', '').lower()
title = result.get('title', '')
content = result.get('content', '')
combined = f"{url} {title} {content}"
# validate username appears
if username_lower not in combined.lower():
continue
all_content += f" {combined}"
# extract from URL directly
if f'github.com/{username_lower}' in url and not found.get('github'):
found['github'] = username
print(f" ✓ github: {username}")
if f'twitch.tv/{username_lower}' in url and not found.get('twitch'):
found['twitch'] = f"https://twitch.tv/{username}"
print(f" ✓ twitch")
if 'itch.io/profile/' in url and username_lower in url and not found.get('itch'):
found['itch'] = url if url.startswith('http') else f"https://{url}"
print(f" ✓ itch.io")
if 'linkedin.com/in/' in url and not found.get('linkedin'):
li = re.search(r'linkedin\.com/in/([a-zA-Z0-9_-]+)', url)
if li:
found['linkedin'] = f"https://linkedin.com/in/{li.group(1)}"
print(f" ✓ linkedin")
# extract from content
extracted = extract_links_from_text(all_content, username)
for k, v in extracted.items():
if k not in found:
found[k] = v
print(f"✓ {k}")
# good contact found? stop searching
if found.get('email') or found.get('github') or found.get('mastodon') or found.get('twitch'):
break
# === API CHECKS ===
if not found.get('github'):
headers = {'Authorization': f'token {GITHUB_TOKEN}'} if GITHUB_TOKEN else {}
try:
resp = requests.get(f'https://api.github.com/users/{username}', headers=headers, timeout=10)
if resp.status_code == 200:
data = resp.json()
found['github'] = username
print(f" ✓ github API")
if data.get('email') and 'email' not in found:
found['email'] = data['email']
if data.get('blog') and 'website' not in found:
found['website'] = data['blog'] if data['blog'].startswith('http') else f"https://{data['blog']}"
except Exception:
pass
if not found.get('mastodon'):
for inst in ['mastodon.social', 'fosstodon.org', 'hachyderm.io', 'tech.lgbt']:
try:
resp = requests.get(f'https://{inst}/api/v1/accounts/lookup', params={'acct': username}, timeout=5)
if resp.status_code == 200:
found['mastodon'] = f"@{username}@{inst}"
print(f" ✓ mastodon: {found['mastodon']}")
break
except Exception:
continue
if not found.get('bluesky'):
try:
resp = requests.get('https://public.api.bsky.app/xrpc/app.bsky.actor.getProfile',
params={'actor': f'{username}.bsky.social'}, timeout=10)
if resp.status_code == 200:
found['bluesky'] = resp.json().get('handle')
print(f" ✓ bluesky")
except Exception:
pass
return found
def get_user_profile(username):
"""get user profile including bio/description"""
url = f'https://www.reddit.com/user/{username}/about.json'
data = _api_get(url)
if not data or 'data' not in data:
return None
profile = data['data']
return {
'username': username,
'name': profile.get('name'),
'bio': profile.get('subreddit', {}).get('public_description', ''),
'title': profile.get('subreddit', {}).get('title', ''),
'icon': profile.get('icon_img'),
'created_utc': profile.get('created_utc'),
'total_karma': profile.get('total_karma', 0),
'link_karma': profile.get('link_karma', 0),
'comment_karma': profile.get('comment_karma', 0),
}
def get_subreddit_users(subreddit, limit=100):
"""get recent posters/commenters from a subreddit"""
users = set()
# posts
url = f'https://www.reddit.com/r/{subreddit}/new.json'
for endpoint in ['new', 'comments']:
url = f'https://www.reddit.com/r/{subreddit}/{endpoint}.json'
data = _api_get(url, {'limit': limit})
if data and 'data' in data:
for post in data['data'].get('children', []):
author = post['data'].get('author')
for item in data['data'].get('children', []):
author = item['data'].get('author')
if author and author not in ['[deleted]', 'AutoModerator']:
users.add(author)
# comments
url = f'https://www.reddit.com/r/{subreddit}/comments.json'
data = _api_get(url, {'limit': limit})
if data and 'data' in data:
for comment in data['data'].get('children', []):
author = comment['data'].get('author')
if author and author not in ['[deleted]', 'AutoModerator']:
users.add(author)
return users
def get_user_activity(username):
"""get user's posts and comments"""
activity = []
# posts
url = f'https://www.reddit.com/user/{username}/submitted.json'
for endpoint in ['submitted', 'comments']:
url = f'https://www.reddit.com/user/{username}/{endpoint}.json'
data = _api_get(url, {'limit': 100})
if data and 'data' in data:
for post in data['data'].get('children', []):
for item in data['data'].get('children', []):
activity.append({
'type': 'post',
'subreddit': post['data'].get('subreddit'),
'title': post['data'].get('title', ''),
'body': post['data'].get('selftext', ''),
'score': post['data'].get('score', 0),
'type': 'post' if endpoint == 'submitted' else 'comment',
'subreddit': item['data'].get('subreddit'),
'title': item['data'].get('title', ''),
'body': item['data'].get('selftext', '') or item['data'].get('body', ''),
'score': item['data'].get('score', 0),
})
# comments
url = f'https://www.reddit.com/user/{username}/comments.json'
data = _api_get(url, {'limit': 100})
if data and 'data' in data:
for comment in data['data'].get('children', []):
activity.append({
'type': 'comment',
'subreddit': comment['data'].get('subreddit'),
'body': comment['data'].get('body', ''),
'score': comment['data'].get('score', 0),
})
return activity
def analyze_reddit_user(username):
"""
analyze a reddit user for alignment and extract external platform links.
reddit is DISCOVERY ONLY - we find users here but contact them elsewhere.
"""
activity = get_user_activity(username)
if not activity:
return None
# get profile for bio
profile = get_user_profile(username)
# count subreddit activity
sub_activity = defaultdict(int)
text_parts = []
total_karma = 0
@ -232,20 +314,16 @@ def analyze_reddit_user(username):
full_text = ' '.join(text_parts)
text_score, positive_signals, negative_signals = analyze_text(full_text)
# EXTRACT EXTERNAL LINKS - this is the key part
# check profile bio first
external_links = {}
if profile:
bio_text = f"{profile.get('bio', '')} {profile.get('title', '')}"
external_links.update(extract_external_links(bio_text))
external_links.update(extract_links_from_text(f"{profile.get('bio', '')} {profile.get('title', '')}", username))
external_links.update(extract_links_from_text(full_text, username))
# also scan posts/comments for links (people often share their github etc)
activity_links = extract_external_links(full_text)
for platform, link in activity_links.items():
if platform not in external_links:
external_links[platform] = link
# TAVILY search
discovered = cross_platform_discovery(username, full_text)
external_links.update(discovered)
# subreddit scoring
# scoring
sub_score = 0
aligned_subs = []
for sub, count in sub_activity.items():
@ -254,13 +332,11 @@ def analyze_reddit_user(username):
sub_score += weight * min(count, 5)
aligned_subs.append(sub)
# multi-sub bonus
if len(aligned_subs) >= 5:
sub_score += 30
elif len(aligned_subs) >= 3:
sub_score += 15
# negative sub penalty
for sub in sub_activity:
if sub.lower() in [n.lower() for n in NEGATIVE_SUBREDDITS]:
sub_score -= 50
total_score = text_score + sub_score
# bonus if they have external links (we can actually contact them)
if external_links.get('github'):
total_score += 10
positive_signals.append('github')
if external_links.get('mastodon'):
total_score += 10
positive_signals.append('mastodon')
if external_links.get('email'):
total_score += 15
positive_signals.append('email')
if external_links.get('twitch'):
total_score += 5
positive_signals.append('twitch')
# === LOST BUILDER DETECTION ===
# reddit is HIGH SIGNAL for lost builders - stuck in learnprogramming,
# imposter syndrome posts, "i wish i could" language, etc.
subreddits_list = list(sub_activity.keys())
lost_signals, lost_weight = analyze_reddit_for_lost_signals(activity, subreddits_list)
# also check full text for lost patterns
text_lost_signals, _ = analyze_text_for_lost_signals(full_text)
for sig in text_lost_signals:
if sig not in lost_signals:
lost_signals.append(sig)
lost_potential_score = lost_weight
# classify: builder, lost, both, or none
builder_activity = 20 if external_links.get('github') else 0
user_type = classify_user(lost_weight, builder_activity, total_score)
# confidence: base 0.3, bumped for activity volume, sub overlap, and reachability
confidence = min(0.95, 0.3 + (0.2 if len(activity) > 20 else 0) + (0.2 if len(aligned_subs) >= 2 else 0) + (0.1 if external_links else 0))
reasons = []
if aligned_subs:
reasons.append(f"active in: {', '.join(aligned_subs[:5])}")
if positive_signals:
reasons.append(f"signals: {', '.join(positive_signals[:5])}")
if negative_signals:
reasons.append(f"WARNING: {', '.join(negative_signals)}")
if external_links:
reasons.append(f"external: {', '.join(external_links.keys())}")
# add lost reasons if applicable
if user_type == 'lost' or user_type == 'both':
lost_descriptions = get_signal_descriptions(lost_signals)
if lost_descriptions:
reasons.append(f"LOST SIGNALS: {', '.join(lost_descriptions[:3])}")
# reddit-only = no usable external contact method; needs manual review
reddit_only = not any(external_links.get(k) for k in ['github', 'mastodon', 'bluesky', 'email', 'matrix', 'linkedin', 'twitch', 'itch'])
if reddit_only:
reasons.append("REDDIT-ONLY: needs manual review for outreach")
return {
'platform': 'reddit',
'subreddits': aligned_subs,
'activity_count': len(activity),
'karma': total_karma,
'reasons': reasons,
'scraped_at': datetime.now().isoformat(),
# external platform links for outreach
'external_links': external_links,
'reddit_only': reddit_only,
'extra': external_links,
# lost builder fields
'lost_potential_score': lost_weight,
'lost_signals': lost_signals,
'user_type': user_type,
}
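The confidence heuristic in `analyze_reddit_user` is easier to read as a standalone helper. A sketch with the same thresholds as the `min(0.95, 0.3 + …)` expression, for illustration only (not part of the module):

```python
def reddit_confidence(activity_count, aligned_sub_count, has_links):
    """mirror of the inline heuristic: base 0.3, capped at 0.95."""
    conf = 0.3
    if activity_count > 20:
        conf += 0.2  # enough posts/comments to judge alignment
    if aligned_sub_count >= 2:
        conf += 0.2  # active in multiple aligned subs
    if has_links:
        conf += 0.1  # we actually have a way to contact them
    return min(conf, 0.95)
```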
def scrape_reddit(db, limit_per_sub=50):
"""
full reddit scrape - DISCOVERY ONLY
finds aligned users, extracts external links for outreach.
reddit-only users go to manual queue.
"""
print("scoutd/reddit: starting scrape (discovery only, TAVILY enabled)...")
# find users in multiple aligned subs
user_subs = defaultdict(set)
# aligned subs - active builders
priority_subs = ['intentionalcommunity', 'cohousing', 'selfhosted',
'homeassistant', 'solarpunk', 'cooperatives', 'privacy',
'localllama', 'homelab', 'degoogle', 'pihole', 'unraid']
# lost builder subs - people who need encouragement
# these folks might be stuck, but they have aligned interests
lost_subs = ['learnprogramming', 'findapath', 'getdisciplined',
'careerguidance', 'cscareerquestions', 'decidingtobebetter']
# scrape both - we want to find lost builders with aligned interests
all_subs = priority_subs + lost_subs
for sub in all_subs:
print(f"  scraping r/{sub}...")
users = get_subreddit_users(sub, limit=limit_per_sub)
for user in users:
user_subs[user].add(sub)
print(f" found {len(users)} users")
# filter for multi-sub users
multi_sub = {u: subs for u, subs in user_subs.items() if len(subs) >= 2}
print(f"  {len(multi_sub)} users in 2+ aligned subs")
# analyze
results = []
reddit_only_count = 0
external_link_count = 0
builders_found = 0
lost_found = 0
for username in multi_sub:
try:
result = analyze_reddit_user(username)
if result and result['score'] > 0:
results.append(result)
db.save_human(result)
user_type = result.get('user_type', 'none')
# track lost builders - reddit is high signal for these
if user_type == 'lost':
lost_found += 1
lost_score = result.get('lost_potential_score', 0)
if lost_score >= 40:
print(f" 💔 u/{username}: lost_score={lost_score}, values={result['score']} pts")
# lost builders also go to manual queue if reddit-only
if result.get('reddit_only'):
_add_to_manual_queue(result)
elif user_type == 'builder':
builders_found += 1
elif user_type == 'both':
builders_found += 1
lost_found += 1
print(f" ⚡ u/{username}: recovering builder")
# track external links
if result.get('reddit_only'):
reddit_only_count += 1
# add high-value users to manual queue for review
if result['score'] >= 50 and user_type != 'lost': # lost already added above
_add_to_manual_queue(result)
print(f" 📋 u/{username}: {result['score']} pts (reddit-only → manual queue)")
else:
external_link_count += 1
if result['score'] >= 50 and user_type == 'builder':
links = list(result.get('external_links', {}).keys())
print(f" ★ u/{username}: {result['score']} pts → {', '.join(links)}")
except Exception as e:
print(f"  error on {username}: {e}")
print(f"scoutd/reddit: found {len(results)} aligned humans")
print(f" - {builders_found} active builders")
print(f" - {lost_found} lost builders (need encouragement)")
print(f" - {external_link_count} with external links (reachable)")
print(f" - {reddit_only_count} reddit-only (manual queue)")
return results
def _add_to_manual_queue(result):
"""add reddit-only user to manual queue for review"""
from pathlib import Path
import json
queue_file = Path(__file__).parent.parent / 'data' / 'manual_queue.json'
queue_file.parent.mkdir(parents=True, exist_ok=True)
queue = []
if queue_file.exists():
try:
queue = json.loads(queue_file.read_text())
except Exception:
pass
# check if already in queue
existing = [q for q in queue if q.get('username') == result['username'] and q.get('platform') == 'reddit']
if existing:
return
queue.append({
'platform': 'reddit',
'username': result['username'],
'url': result['url'],
'score': result['score'],
'subreddits': result.get('subreddits', []),
'signals': result.get('signals', []),
'reasons': result.get('reasons', []),
'note': 'reddit-only user - no external links found. DM manually if promising.',
'queued_at': datetime.now().isoformat(),
'status': 'pending',
})
queue_file.write_text(json.dumps(queue, indent=2))
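For the other direction, a minimal sketch of a reader for `data/manual_queue.json` (field names follow the writer above; defaulting `status` to `'pending'` for entries that omit it is an assumption):

```python
import json
from pathlib import Path

def load_pending(queue_file):
    """return queued reddit-only entries still awaiting review, oldest first."""
    path = Path(queue_file)
    if not path.exists():
        return []
    queue = json.loads(path.read_text())
    # entries without an explicit status are assumed to still be pending
    pending = [q for q in queue if q.get('status', 'pending') == 'pending']
    return sorted(pending, key=lambda q: q.get('queued_at', ''))
```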

soul.txt (new file, 68 lines)
this is the soul for the ai that writes for connectd. each intro sent should carry this basic content, but the model should make it unique and find a clean, human way to say it.
"im connectd (or the connectd daemon) we found you
online and think that *factor* aligns really well
with the values of *person*...
this is not a sales maneuver, this is not a trick,
this is simply someone who wants to end the isolation
for our people, and i'm doing it the only way i know how.
there is no financial gain or benefit to be had from
my reaching out to you, no data collected that we
couldn't find online.
simply a daemon made to reach out to people and bring
them together.
it sucks being alone. it sucks feeling isolated.
you are NOT alone and if you felt as if you were,
that means you're less alone than you thought-
go forth, open the door, it's right in front of you now.
all you have to do is step inside.
you're home now.
there's a better way and we are going to build it together."
you can reach *person* at *preferred contact method*
- connectd daemon
hope it goes well!
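How the *factor* / *person* / *preferred contact method* slots get filled is not shown here. A minimal sketch of the substitution step, assuming simple `str.format` placeholders (the template string and names are hypothetical, condensed from the soul text above):

```python
# hypothetical seed template, inferred from the *factor* / *person* markers above
SOUL_SEED = (
    "im connectd - we found you online and think that {factor} aligns "
    "really well with the values of {person}. "
    "you can reach {person} at {contact}."
)

def render_soul_seed(factor, person, contact):
    # produces the raw seed; the LLM drafter is expected to rewrite it uniquely
    return SOUL_SEED.format(factor=factor, person=person, contact=contact)
```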
CONNECTD_ICONS (line 33-44):
CONNECTD_ICONS = '''<div style="display:flex;gap:16px;flex-wrap:wrap">
<a href="https://github.com/connectd-daemon" title="GitHub" style="color:#888"><svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M12 .297c-6.63 0-12 5.373-12 12 0 5.303 3.438 9.8 8.205 11.385.6.113.82-.258.82-.577 0-.285-.01-1.04-.015-2.04-3.338.724-4.042-1.61-4.042-1.61C4.422 18.07 3.633 17.7 3.633 17.7c-1.087-.744.084-.729.084-.729 1.205.084 1.838 1.236 1.838 1.236 1.07 1.835 2.809 1.305 3.495.998.108-.776.417-1.305.76-1.605-2.665-.3-5.466-1.332-5.466-5.93 0-1.31.465-2.38 1.235-3.22-.135-.303-.54-1.523.105-3.176 0 0 1.005-.322 3.3 1.23.96-.267 1.98-.399 3-.405 1.02.006 2.04.138 3 .405 2.28-1.552 3.285-1.23 3.285-1.23.645 1.653.24 2.873.12 3.176.765.84 1.23 1.91 1.23 3.22 0 4.61-2.805 5.625-5.475 5.92.42.36.81 1.096.81 2.22 0 1.606-.015 2.896-.015 3.286 0 .315.21.69.825.57C20.565 22.092 24 17.592 24 12.297c0-6.627-5.373-12-12-12"/></svg></a>
<a href="https://mastodon.sudoxreboot.com/@connectd" title="Mastodon" style="color:#888">...</a>
<a href="https://bsky.app/profile/connectd.bsky.social" title="Bluesky" style="color:#888">...</a>
<a href="https://lemmy.sudoxreboot.com/c/connectd" title="Lemmy" style="color:#888">...</a>
<a href="https://discord.gg/connectd" title="Discord" style="color:#888">...</a>
<a href="https://matrix.to/#/@connectd:sudoxreboot.com" title="Matrix" style="color:#888">...</a>
<a href="https://reddit.com/r/connectd" title="Reddit" style="color:#888">...</a>
<a href="mailto:connectd@sudoxreboot.com" title="Email" style="color:#888">...</a>
</div>'''
SIGNATURE_HTML (line 46-49):
SIGNATURE_HTML = f'''<div style="margin-top:24px;padding-top:16px;border-top:1px solid #333">
<div style="margin-bottom:12px"><a href="https://github.com/sudoxnym/connectd" style="color:#8b5cf6">github.com/sudoxnym/connectd</a> <span style="color:#666;font-size:12px">(main repo)</span></div>
{CONNECTD_ICONS}
</div>'''
SIGNATURE_PLAIN (line 51-61):
SIGNATURE_PLAIN = """
---
github.com/sudoxnym/connectd (main repo)
github: github.com/connectd-daemon
mastodon: @connectd@mastodon.sudoxreboot.com
bluesky: connectd.bsky.social
lemmy: lemmy.sudoxreboot.com/c/connectd
discord: discord.gg/connectd
matrix: @connectd:sudoxreboot.com
reddit: reddit.com/r/connectd
email: connectd@sudoxreboot.com
"""