mirror of https://github.com/sudoxnym/connectd.git
synced 2026-04-14 19:46:30 +00:00

Compare commits

No commits in common. "v1.1.0" and "master" have entirely different histories.
72 changed files with 8185 additions and 8814 deletions
89  .env.example
@@ -1,58 +1,81 @@
 # connectd environment variables
 # copy to .env and fill in your values

-# === REQUIRED FOR LLM DRAFTING ===
+# === REQUIRED ===
 GROQ_API_KEY=
-GROQ_MODEL=llama-3.1-70b-versatile
+GROQ_MODEL=llama-3.3-70b-versatile

-# === DISCOVERY SOURCES ===
-# github (optional - works without token but rate limited)
+# === DISTRIBUTED MODE (optional) ===
+# for coordinating multiple connectd instances
+CONNECTD_CENTRAL_API=
+CONNECTD_API_KEY=
+CONNECTD_INSTANCE_ID=
+CONNECTD_INSTANCE_IP=
+
+# === DISCOVERY: GITHUB ===
+# works without token but heavily rate limited
 GITHUB_TOKEN=

-# mastodon (for DM delivery)
+# === DISCOVERY: FEDIVERSE ===
 MASTODON_TOKEN=
-MASTODON_INSTANCE=mastodon.social
+MASTODON_INSTANCE=

-# bluesky (for DM delivery)
-BLUESKY_HANDLE=
-BLUESKY_APP_PASSWORD=
-
-# matrix (for DM delivery)
-MATRIX_HOMESERVER=
-MATRIX_USER_ID=
-MATRIX_ACCESS_TOKEN=
-
-# discord (for discovery + DM delivery)
-DISCORD_BOT_TOKEN=
-DISCORD_TARGET_SERVERS= # comma separated server IDs
-
-# lemmy (for authenticated access to your instance)
 LEMMY_INSTANCE=
 LEMMY_USERNAME=
 LEMMY_PASSWORD=

-# === EMAIL DELIVERY ===
+# === DISCOVERY: OTHER ===
+DISCORD_BOT_TOKEN=
+DISCORD_TARGET_SERVERS=
+
+# === DELIVERY: EMAIL ===
 SMTP_HOST=
 SMTP_PORT=465
 SMTP_USER=
 SMTP_PASS=
-FROM_EMAIL=connectd <connectd@yourdomain.com>
+FROM_EMAIL=
+
+# === DELIVERY: SOCIAL ===
+# mastodon - reuses discovery token above
+# MASTODON_TOKEN=
+# MASTODON_INSTANCE=
+
+BLUESKY_HANDLE=
+BLUESKY_APP_PASSWORD=
+
+MATRIX_HOMESERVER=
+MATRIX_USER_ID=
+MATRIX_ACCESS_TOKEN=
+
+# === DELIVERY: FORGE ISSUES ===
+# for creating issues on self-hosted git forges
+# highest signal outreach - these people actually selfhost
+
+# codeberg (largest public gitea instance)
+CODEBERG_TOKEN=
+
+# gitea/forgejo instances - format: GITEA_TOKEN_<host_with_underscores>=token
+# examples:
+# GITEA_TOKEN_git_example_com=your-token
+# GITEA_TOKEN_192_168_1_8_3000=your-token
+
+# gitlab CE instances - format: GITLAB_TOKEN_<host_with_underscores>=token
+# examples:
+# GITLAB_TOKEN_gitlab_example_com=your-token

 # === HOST USER CONFIG ===
-# the person running connectd - gets priority matching
+# set HOST_USER to your github username and connectd will auto-discover your info
+# other vars override/supplement discovered values
+# you - gets priority matching and appears in intros
 HOST_USER=
 HOST_NAME=
 HOST_EMAIL=
-HOST_GITHUB= # defaults to HOST_USER
-HOST_MASTODON= # format: @user@instance
+HOST_GITHUB=
+HOST_MASTODON=
 HOST_REDDIT=
-HOST_LEMMY= # format: @user@instance
+HOST_LEMMY=
 HOST_LOBSTERS=
-HOST_MATRIX= # format: @user:server
-HOST_DISCORD= # user id
-HOST_BLUESKY= # format: handle.bsky.social
+HOST_MATRIX=
+HOST_DISCORD=
+HOST_BLUESKY=
 HOST_LOCATION=
-HOST_INTERESTS= # comma separated: intentional-community,cooperative,solarpunk
-HOST_LOOKING_FOR= # what you're looking for in matches
+HOST_INTERESTS=
+HOST_LOOKING_FOR=
98  README.md
@@ -18,11 +18,11 @@ we lift them up. we show them what's possible. we connect them to people who GET
 ## what it does

-1. **scouts** - discovers humans across platforms (github, reddit, mastodon, lemmy, discord, lobsters, bluesky, matrix)
+1. **scouts** - discovers humans across platforms (github, mastodon, lemmy, reddit, lobsters, bluesky, matrix, discord, and self-hosted git forges)
 2. **analyzes** - scores them for values alignment AND lost builder potential
 3. **matches** - pairs aligned builders together, or pairs lost builders with inspiring active ones
 4. **drafts** - uses LLM to write genuine, personalized intros
-5. **delivers** - sends via email, mastodon DM, bluesky DM, matrix DM, discord DM, or github issue
+5. **delivers** - sends via the channel they're most active on (email, mastodon, bluesky, matrix, discord, github issue, or forge issue)

 fully autonomous. no manual review. self-sustaining pipe.

@@ -45,6 +45,40 @@ people who have potential but haven't started yet, gave up, or are struggling:
 lost builders don't get matched to each other (both need energy). they get matched to ACTIVE builders who can inspire them.

+## discovery sources
+
+| platform | method |
+|----------|--------|
+| github | API + profile scraping |
+| mastodon | public API |
+| lemmy | federation API |
+| reddit | public API |
+| lobsters | web scraping |
+| bluesky | AT Protocol |
+| matrix | room membership |
+| discord | bot API |
+| **gitea/forgejo** | instance API |
+| **gitlab CE** | instance API |
+| **gogs** | instance API |
+| **sourcehut** | web scraping |
+| **codeberg** | gitea API |
+
+self-hosted git forge users = highest signal. they actually selfhost.
+
+## delivery methods
+
+connectd picks the best contact method based on **activity** - not a static priority list. if someone's most active on mastodon, they get a mastodon DM. if that fails, it falls back to their second-most-active platform.
+
+| method | notes |
+|--------|-------|
+| email | extracted from profiles, commits, websites |
+| mastodon DM | if they allow DMs |
+| bluesky DM | via AT Protocol |
+| matrix DM | creates DM room |
+| discord DM | via bot |
+| github issue | on their most active repo |
+| **forge issue** | gitea/forgejo/gitlab/gogs repos |
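the activity-ranked fallback the README describes can be sketched roughly as follows. This is an illustration only - the function names, the shape of the activity data, and the sender callables are hypothetical, not connectd's actual internals:

```python
# illustrative sketch of activity-ranked delivery with fallback.
# names and data shapes are hypothetical, not connectd's actual code.

def pick_delivery_order(activity_by_platform):
    """rank contact channels by how active the person is on each"""
    return [p for p, _ in sorted(activity_by_platform.items(),
                                 key=lambda kv: kv[1], reverse=True)]

def deliver_with_fallback(activity_by_platform, senders):
    """try each channel in activity order until one succeeds; return the
    platform that worked, or None if every channel failed"""
    for platform in pick_delivery_order(activity_by_platform):
        send = senders.get(platform)
        if send and send():
            return platform
    return None
```

with this shape, a person most active on mastodon gets tried there first, and a failed send simply moves on to the next-most-active channel.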

 ## quick start

 ```bash
@@ -71,7 +105,7 @@ python daemon.py # live mode
 # discovery
 python cli.py scout # all platforms
 python cli.py scout --github # github only
-python cli.py scout --reddit --lemmy # specific platforms
+python cli.py scout --forges # self-hosted git forges
 python cli.py scout --user octocat # deep scrape one user

 # matching
@@ -97,27 +131,24 @@ python cli.py daemon --oneshot # run once then exit
 python cli.py status # show stats
 ```

-## docker
+## distributed mode

+multiple connectd instances can coordinate via a central API to:
+- share discovered humans
+- avoid duplicate outreach
+- claim/release outreach targets
+
 ```bash
-# build
-docker build -t connectd .
-
-# run daemon
-docker compose up -d
-
-# run one-off commands
-docker compose run --rm connectd python cli.py scout
-docker compose run --rm connectd python cli.py status
+# set in .env
+CONNECTD_CENTRAL_API=https://your-central-api.com
+CONNECTD_API_KEY=your-api-key
+CONNECTD_INSTANCE_ID=instance-name
+CONNECTD_INSTANCE_IP=your-ip
 ```

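a minimal sketch of how an instance might read this configuration. The variable names come from `.env.example`; the helper name and the "enabled only when the central API and key are set" rule are assumptions for illustration, not connectd's actual logic:

```python
import os

# hypothetical helper: gather the distributed-mode vars from the
# environment and treat the mode as enabled only when the central API
# URL and API key are both present (an assumption, not connectd's code)
def distributed_config(env=None):
    env = os.environ if env is None else env
    cfg = {k: env.get(k, '') for k in (
        'CONNECTD_CENTRAL_API', 'CONNECTD_API_KEY',
        'CONNECTD_INSTANCE_ID', 'CONNECTD_INSTANCE_IP')}
    cfg['enabled'] = bool(cfg['CONNECTD_CENTRAL_API'] and cfg['CONNECTD_API_KEY'])
    return cfg
```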
 ## environment variables

-copy `.env.example` to `.env` and fill in your values:
-
-```bash
-cp .env.example .env
-```
+copy `.env.example` to `.env` and fill in your values.

 ### required

@@ -132,7 +163,7 @@
 | `GITHUB_TOKEN` | higher rate limits for github API |
 | `DISCORD_BOT_TOKEN` | discord bot token for server access |
 | `DISCORD_TARGET_SERVERS` | comma-separated server IDs to scout |
-| `LEMMY_INSTANCE` | your lemmy instance (e.g. `lemmy.ml`) |
+| `LEMMY_INSTANCE` | your lemmy instance |
 | `LEMMY_USERNAME` | lemmy username for auth |
 | `LEMMY_PASSWORD` | lemmy password for auth |

@@ -141,11 +172,11 @@
 | variable | description |
 |----------|-------------|
 | `MASTODON_TOKEN` | mastodon access token |
-| `MASTODON_INSTANCE` | your mastodon instance (e.g. `mastodon.social`) |
-| `BLUESKY_HANDLE` | bluesky handle (e.g. `you.bsky.social`) |
+| `MASTODON_INSTANCE` | your mastodon instance |
+| `BLUESKY_HANDLE` | bluesky handle |
 | `BLUESKY_APP_PASSWORD` | bluesky app password |
 | `MATRIX_HOMESERVER` | matrix homeserver URL |
-| `MATRIX_USER_ID` | matrix user ID (e.g. `@bot:matrix.org`) |
+| `MATRIX_USER_ID` | matrix user ID |
 | `MATRIX_ACCESS_TOKEN` | matrix access token |
 | `SMTP_HOST` | email server host |
 | `SMTP_PORT` | email server port (default 465) |

@@ -153,15 +184,32 @@
 | `SMTP_PASS` | email password |
 | `FROM_EMAIL` | from address for emails |

 you need at least ONE delivery method configured for intros to be sent.

+### forge tokens
+
+for creating issues on self-hosted git forges:
+
+| variable | description |
+|----------|-------------|
+| `CODEBERG_TOKEN` | codeberg.org access token |
+| `GITEA_TOKEN_<instance>` | gitea/forgejo token (e.g. `GITEA_TOKEN_git_example_com`) |
+| `GITLAB_TOKEN_<instance>` | gitlab CE token (e.g. `GITLAB_TOKEN_gitlab_example_com`) |
+
+instance names use underscores: `git.example.com` → `GITEA_TOKEN_git_example_com`
+
+for ports: `192.168.1.8:3000` → `GITEA_TOKEN_192_168_1_8_3000`

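the naming rule above (dots, colons, and any other separators in the host become underscores) can be sketched in a few lines. The helper name is illustrative; only the resulting variable names come from the README:

```python
# sketch of the forge-token naming rule: every non-alphanumeric
# character in the host (dots, colons for ports) becomes "_".
# the helper name is hypothetical, not part of connectd.
def forge_token_var(host, kind='GITEA'):
    safe = ''.join(ch if ch.isalnum() else '_' for ch in host)
    return f"{kind}_TOKEN_{safe}"
```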
 ## architecture

 ```
 scoutd/ - discovery modules (one per platform)
+forges.py - gitea/forgejo/gitlab/gogs/sourcehut scraper
+handles.py - cross-platform handle discovery
 matchd/ - matching + fingerprinting logic
 introd/ - intro drafting + delivery
+deliver.py - multi-channel delivery with fallback
+groq_draft.py - LLM-powered intro generation
 db/ - sqlite storage
+central_client.py - distributed coordination
 config.py - central configuration
 daemon.py - continuous runner
 cli.py - command line interface

@@ -171,8 +219,8 @@ cli.py - command line interface

 - scout: every 4 hours
 - match: every 1 hour
-- stranger intros: every 2 hours (max 20/day)
-- lost builder intros: every 6 hours (max 5/day)
+- intros: every 2 hours (max 1000/day)
+- lost builder intros: every 6 hours (max 100/day)
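the schedule above combines a per-job interval with a daily cap. A gate like this one captures the idea; the class and its fields are hypothetical, not connectd's daemon code:

```python
# illustrative gate for the schedule above: run a job at most once per
# interval and at most `daily_max` times per day. names are hypothetical.
class JobGate:
    def __init__(self, interval_s, daily_max):
        self.interval_s = interval_s
        self.daily_max = daily_max
        self.last_run = None
        self.count_today = 0

    def should_run(self, now):
        """now is a monotonic timestamp in seconds"""
        if self.count_today >= self.daily_max:
            return False
        if self.last_run is not None and now - self.last_run < self.interval_s:
            return False
        return True

    def mark_run(self, now):
        self.last_run = now
        self.count_today += 1  # a real daemon would reset this at midnight

# e.g. intros: every 2 hours, max 1000/day
intro_gate = JobGate(interval_s=2 * 3600, daily_max=1000)
```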
 ## forking


BIN  Screenshot_20251216_021234.png  Normal file
Binary file not shown.
After Width: | Height: | Size: 130 KiB
976
api.py.backup.20251215_210704
Normal file
976
api.py.backup.20251215_210704
Normal file
|
|
@ -0,0 +1,976 @@
|
|||
#!/usr/bin/env python3
|
||||
"""
|
||||
connectd/api.py - REST API for stats and control
|
||||
|
||||
exposes daemon stats for home assistant integration.
|
||||
runs on port 8099 by default.
|
||||
"""
|
||||
|
||||
import os
|
||||
import json
|
||||
import threading
|
||||
from http.server import HTTPServer, BaseHTTPRequestHandler
|
||||
from datetime import datetime
|
||||
|
||||
from db import Database
|
||||
from db.users import get_priority_users, get_priority_user_matches, get_priority_user
|
||||
|
||||
API_PORT = int(os.environ.get('CONNECTD_API_PORT', 8099))
|
||||
|
||||
# shared state (updated by daemon)
|
||||
_daemon_state = {
|
||||
'running': False,
|
||||
'dry_run': False,
|
||||
'last_scout': None,
|
||||
'last_match': None,
|
||||
'last_intro': None,
|
||||
'last_lost': None,
|
||||
'intros_today': 0,
|
||||
'lost_intros_today': 0,
|
||||
'started_at': None,
|
||||
}
|
||||
|
||||
|
||||
def update_daemon_state(state_dict):
|
||||
"""update shared daemon state (called by daemon)"""
|
||||
global _daemon_state
|
||||
_daemon_state.update(state_dict)
|
||||
|
||||
|
||||
def get_daemon_state():
|
||||
"""get current daemon state"""
|
||||
return _daemon_state.copy()
|
||||
|
||||
|
||||
|
||||
DASHBOARD_HTML = """<!DOCTYPE html>
|
||||
<html>
|
||||
<head>
|
||||
<title>connectd</title>
|
||||
<meta charset=utf-8>
|
||||
<link rel="icon" type="image/png" href="/favicon.png">
|
||||
<style>
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
body{font-family:monospace;background:#0a0a0f;color:#0f8;padding:20px}
|
||||
h1{color:#c792ea;margin-bottom:15px}
|
||||
h2{color:#82aaff;margin:15px 0 10px}
|
||||
.stats{display:flex;gap:12px;flex-wrap:wrap;margin-bottom:15px}
|
||||
.stat{background:#1a1a2e;padding:10px 16px;border-radius:6px;border:1px solid #333;text-align:center}
|
||||
.stat b{font-size:1.6em;color:#c792ea;display:block}
|
||||
.stat small{color:#666;font-size:.75em}
|
||||
.card{background:#1a1a2e;border:1px solid #333;border-radius:6px;padding:10px;margin-bottom:8px;cursor:pointer}
|
||||
.card:hover{border-color:#0f8}
|
||||
.card-hdr{display:flex;justify-content:space-between;color:#82aaff}
|
||||
.score{background:#2a2a4e;padding:2px 8px;border-radius:4px;color:#c792ea}
|
||||
.body{background:#0d0d15;padding:10px;border-radius:4px;white-space:pre-wrap;color:#ddd;margin-top:8px;font-size:.85em}
|
||||
.meta{color:#666;font-size:.75em;margin-top:5px}
|
||||
.m{display:inline-block;padding:1px 5px;border-radius:3px;font-size:.75em}
|
||||
.m-email{background:#2d4a2d;color:#8f8}
|
||||
.m-mastodon{background:#3d3a5c;color:#c792ea}
|
||||
.m-new{background:#2d3a4a;color:#82aaff}
|
||||
.tabs{margin-bottom:12px}
|
||||
.tab{background:#1a1a2e;border:1px solid #333;color:#0f8;padding:6px 14px;cursor:pointer;font-family:monospace;font-size:.9em}
|
||||
.tab.on{background:#2a2a4e;border-color:#0f8}
|
||||
.pnl{display:none}
|
||||
.pnl.on{display:block}
|
||||
.btn{background:#0f8;color:#0a0a0f;border:none;padding:6px 14px;cursor:pointer;font-family:monospace;font-weight:bold;margin-left:10px;font-size:.9em}
|
||||
.err{color:#f66}
|
||||
a{color:#82aaff}
|
||||
.status{font-size:.85em;color:#888;margin-bottom:10px}
|
||||
.status b{color:#0f8}
|
||||
.cached{color:#555;font-size:.7em}
|
||||
.to{color:#f7c}
|
||||
.about{color:#82aaff}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<h1>connectd <a href="https://github.com/sudoxnym/connectd" style="font-size:.5em;color:#82aaff">repo</a> <a href="https://github.com/connectd-daemon" style="font-size:.5em;color:#f7c">org</a></h1>
|
||||
<div class="status" id="status"></div>
|
||||
<div class="stats" id="stats"></div>
|
||||
<div class="tabs">
|
||||
<button class="tab on" onclick="show('host')">you</button>
|
||||
<button class="tab" onclick="show('queue')">queue</button>
|
||||
<button class="tab" onclick="show('sent')">sent</button>
|
||||
<button class="tab" onclick="show('failed')">failed</button>
|
||||
<button class="btn" onclick="load()">refresh</button>
|
||||
</div>
|
||||
<div id="host" class="pnl on"></div>
|
||||
<div id="queue" class="pnl"></div>
|
||||
<div id="sent" class="pnl"></div>
|
||||
<div id="failed" class="pnl"></div>
|
||||
<script>
|
||||
async function loadStats(){
|
||||
var sr=await fetch('/api/stats'),hr=await fetch('/api/host');
|
||||
var s=await sr.json(),h=await hr.json();
|
||||
var up=h.uptime_seconds?Math.floor(h.uptime_seconds/3600)+'h '+Math.floor((h.uptime_seconds%3600)/60)+'m':'0m';
|
||||
document.getElementById('status').innerHTML='daemon <b>'+(h.running?'ON':'OFF')+'</b> | '+up+' | '+h.intros_today+' today';
|
||||
document.getElementById('stats').innerHTML='<div class="stat"><b>'+s.total_humans+'</b><small>humans</small></div><div class="stat"><b>'+s.total_matches+'</b><small>matches</small></div><div class="stat"><b>'+h.score_90_plus+'</b><small>90+</small></div><div class="stat"><b>'+h.score_80_89+'</b><small>80+</small></div><div class="stat"><b>'+h.matches_pending+'</b><small>queue</small></div><div class="stat"><b>'+s.sent_intros+'</b><small>sent</small></div>';
|
||||
}
|
||||
async function loadHost(){
|
||||
var r=await fetch('/api/host_matches?limit=20'),d=await r.json();
|
||||
var c='<h2>your matches ('+d.host+')</h2>';
|
||||
c+='<p style="color:#666;font-size:.8em;margin-bottom:10px">each match = 2 intros (one to you, one to them)</p>';
|
||||
if(!d.matches||!d.matches.length){c+='<div class="meta">no matches yet</div>';}
|
||||
for(var i=0;i<(d.matches||[]).length;i++){
|
||||
var m=d.matches[i];
|
||||
c+='<div class="card" onclick="prevHost('+m.id+',1,this)"><div class="card-hdr"><span class="to">TO: you</span><span class="score">'+m.score+'</span></div><div class="meta"><span class="about">ABOUT: '+m.other_user+'</span> ('+m.other_platform+')</div><div class="meta">'+(m.reasons||[]).slice(0,2).join(', ')+'</div><div id="h'+m.id+'a" class="body" style="display:none"></div></div>';
|
||||
c+='<div class="card" onclick="prevHost('+m.id+',2,this)"><div class="card-hdr"><span class="to">TO: '+m.other_user+'</span><span class="score">'+m.score+'</span></div><div class="meta"><span class="about">ABOUT: you</span></div><div class="meta">'+(m.contact||'no contact')+'</div><div id="h'+m.id+'b" class="body" style="display:none"></div></div>';
|
||||
}
|
||||
document.getElementById('host').innerHTML=c;
|
||||
}
|
||||
async function prevHost(id,dir,card){
|
||||
var el=document.getElementById('h'+id+(dir==1?'a':'b'));
|
||||
if(el.style.display!='none'){el.style.display='none';return;}
|
||||
el.innerHTML='loading...';el.style.display='block';
|
||||
var r=await fetch('/api/preview_host_draft?id='+id+'&dir='+(dir==1?'to_you':'to_them'));
|
||||
var d=await r.json();
|
||||
if(d.error){el.innerHTML='<span class="err">'+d.error+'</span>';}
|
||||
else{el.innerHTML='<b>SUBJ:</b> '+d.subject+(d.cached?' <span class="cached">(cached)</span>':'')+'<br><br>'+d.draft;}
|
||||
}
|
||||
async function loadQueue(){
|
||||
var r=await fetch('/api/pending_matches?limit=40'),d=await r.json();
|
||||
var c='<h2>outreach queue</h2>';
|
||||
if(!d.matches||!d.matches.length){c+='<div class="meta">empty</div>';}
|
||||
for(var i=0;i<(d.matches||[]).length;i++){
|
||||
var p=d.matches[i];
|
||||
c+='<div class="card" onclick="prevQ('+p.id+',this)"><div class="card-hdr"><span class="to">TO: '+p.to_user+'</span><span class="score">'+p.score+'</span></div><div class="meta"><span class="about">ABOUT: '+p.about_user+'</span> | <span class="m m-'+(p.method||'new')+'">'+(p.method||'?')+'</span> '+(p.contact||'')+'</div><div id="q'+p.id+'_'+i+'" class="body" style="display:none"></div></div>';
|
||||
}
|
||||
document.getElementById('queue').innerHTML=c;
|
||||
}
|
||||
async function prevQ(id,card){
|
||||
var el=card.querySelector('.body');
|
||||
if(el.style.display!='none'){el.style.display='none';return;}
|
||||
el.innerHTML='loading...';el.style.display='block';
|
||||
var r=await fetch('/api/preview_draft?id='+id);
|
||||
var d=await r.json();
|
||||
if(d.error){el.innerHTML='<span class="err">'+d.error+'</span>';}
|
||||
else{el.innerHTML='<b>TO:</b> '+d.to+'\n<b>ABOUT:</b> '+d.about+'\n<b>SUBJ:</b> '+d.subject+(d.cached?' <span class="cached">(cached)</span>':'')+'<br><br>'+d.draft;}
|
||||
}
|
||||
async function loadSent(){var r=await fetch('/api/sent_intros'),d=await r.json();var c='<h2>sent</h2>';for(var i=0;i<(d.sent||[]).length;i++){var s=d.sent[i];c+='<div class="card"><div class="card-hdr">TO: '+s.recipient_id+' <span class="m m-'+s.method+'">'+s.method+'</span></div><div class="body">'+(s.draft||'-')+'</div><div class="meta">'+s.timestamp+'</div></div>';}document.getElementById('sent').innerHTML=c;}
|
||||
async function loadFailed(){var r=await fetch('/api/failed_intros'),d=await r.json();var c='<h2>failed</h2>';for(var i=0;i<(d.failed||[]).length;i++){var f=d.failed[i];c+='<div class="card"><div class="card-hdr">'+f.recipient_id+'</div><div class="meta err">'+f.error+'</div></div>';}document.getElementById('failed').innerHTML=c;}
|
||||
function show(n){document.querySelectorAll('.pnl').forEach(function(e){e.classList.remove('on')});document.querySelectorAll('.tab').forEach(function(e){e.classList.remove('on')});document.getElementById(n).classList.add('on');event.target.classList.add('on');}
|
||||
function load(){loadStats();loadHost();loadQueue();loadSent();loadFailed();}
|
||||
load();setInterval(load,60000);
|
||||
</script>
|
||||
</body>
|
||||
</html>"""
|
||||
|
||||
|
||||
# draft cache - stores generated drafts so they dont regenerate
|
||||
_draft_cache = {}
|
||||
|
||||
def get_cached_draft(match_id, match_type='match'):
|
||||
key = f"{match_type}:{match_id}"
|
||||
return _draft_cache.get(key)
|
||||
|
||||
def cache_draft(match_id, draft_data, match_type='match'):
|
||||
key = f"{match_type}:{match_id}"
|
||||
_draft_cache[key] = draft_data
|
||||
|
||||
class APIHandler(BaseHTTPRequestHandler):
|
||||
"""simple REST API handler"""
|
||||
|
||||
def log_message(self, format, *args):
|
||||
"""suppress default logging"""
|
||||
pass
|
||||
|
||||
def _send_json(self, data, status=200):
|
||||
"""send JSON response"""
|
||||
self.send_response(status)
|
||||
self.send_header('Content-Type', 'application/json')
|
||||
self.send_header('Access-Control-Allow-Origin', '*')
|
||||
self.end_headers()
|
||||
self.wfile.write(json.dumps(data).encode())
|
||||
|
||||
def do_GET(self):
|
||||
"""handle GET requests"""
|
||||
path = self.path.split('?')[0]
|
||||
if path == '/favicon.png' or path == '/favicon.ico':
|
||||
self._handle_favicon()
|
||||
elif path == '/' or path == '/dashboard':
|
||||
self._handle_dashboard()
|
||||
elif path == '/api/stats':
|
||||
self._handle_stats()
|
||||
elif path == '/api/host':
|
||||
self._handle_host()
|
||||
elif path == '/api/host_matches':
|
||||
self._handle_host_matches()
|
||||
elif path == '/api/your_matches':
|
||||
self._handle_your_matches()
|
||||
elif path == '/api/preview_match_draft':
|
||||
self._handle_preview_match_draft()
|
||||
elif path == '/api/preview_host_draft':
|
||||
self._handle_preview_host_draft()
|
||||
elif path == '/api/preview_draft':
|
||||
self._handle_preview_draft()
|
||||
elif path == '/api/pending_about_you':
|
||||
self._handle_pending_about_you()
|
||||
elif path == '/api/pending_to_you':
|
||||
self._handle_pending_to_you()
|
||||
elif path == '/api/pending_matches':
|
||||
self._handle_pending_matches()
|
||||
elif path == '/api/sent_intros':
|
||||
self._handle_sent_intros()
|
||||
elif path == '/api/failed_intros':
|
||||
self._handle_failed_intros()
|
||||
elif path == '/api/health':
|
||||
self._handle_health()
|
||||
elif path == '/api/state':
|
||||
self._handle_state()
|
||||
elif path == '/api/priority_matches':
|
||||
self._handle_priority_matches()
|
||||
elif path == '/api/top_humans':
|
||||
self._handle_top_humans()
|
||||
elif path == '/api/user':
|
||||
self._handle_user()
|
||||
else:
|
||||
self._send_json({'error': 'not found'}, 404)
|
||||
def _handle_favicon(self):
|
||||
from pathlib import Path
|
||||
fav = Path('/app/data/favicon.png')
|
||||
if fav.exists():
|
||||
self.send_response(200)
|
||||
self.send_header('Content-Type', 'image/png')
|
||||
self.end_headers()
|
||||
self.wfile.write(fav.read_bytes())
|
||||
else:
|
||||
self.send_response(404)
|
||||
self.end_headers()
|
||||
|
||||
def _handle_dashboard(self):
|
||||
self.send_response(200)
|
||||
self.send_header("Content-Type", "text/html")
|
||||
self.end_headers()
|
||||
self.wfile.write(DASHBOARD_HTML.encode())
|
||||
|
||||
def _handle_sent_intros(self):
|
||||
from pathlib import Path
|
||||
log_path = Path("/app/data/delivery_log.json")
|
||||
sent = []
|
||||
if log_path.exists():
|
||||
with open(log_path) as f:
|
||||
log = json.load(f)
|
||||
sent = log.get("sent", [])[-20:]
|
||||
sent.reverse()
|
||||
self._send_json({"sent": sent})
|
||||
|
||||
def _handle_failed_intros(self):
|
||||
from pathlib import Path
|
||||
log_path = Path("/app/data/delivery_log.json")
|
||||
failed = []
|
||||
if log_path.exists():
|
||||
with open(log_path) as f:
|
||||
log = json.load(f)
|
||||
failed = log.get("failed", [])
|
||||
self._send_json({"failed": failed})
|
||||
|
||||
def _handle_host(self):
|
||||
"""daemon status and match stats"""
|
||||
import sqlite3
|
||||
state = get_daemon_state()
|
||||
try:
|
||||
conn = sqlite3.connect('/data/db/connectd.db')
|
||||
c = conn.cursor()
|
||||
c.execute("SELECT COUNT(*) FROM matches WHERE status='pending' AND overlap_score >= 60")
|
||||
pending = c.fetchone()[0]
|
||||
c.execute("SELECT COUNT(*) FROM matches WHERE status='intro_sent'")
|
||||
sent = c.fetchone()[0]
|
||||
c.execute("SELECT COUNT(*) FROM matches WHERE status='rejected'")
|
||||
rejected = c.fetchone()[0]
|
||||
c.execute("SELECT COUNT(*) FROM matches")
|
||||
total = c.fetchone()[0]
|
||||
c.execute("SELECT COUNT(*) FROM matches WHERE overlap_score >= 90")
|
||||
s90 = c.fetchone()[0]
|
||||
c.execute("SELECT COUNT(*) FROM matches WHERE overlap_score >= 80 AND overlap_score < 90")
|
||||
s80 = c.fetchone()[0]
|
||||
c.execute("SELECT COUNT(*) FROM matches WHERE overlap_score >= 70 AND overlap_score < 80")
|
||||
s70 = c.fetchone()[0]
|
||||
c.execute("SELECT COUNT(*) FROM matches WHERE overlap_score >= 60 AND overlap_score < 70")
|
||||
s60 = c.fetchone()[0]
|
||||
conn.close()
|
||||
except:
|
||||
pending = sent = rejected = total = s90 = s80 = s70 = s60 = 0
|
||||
uptime = None
|
||||
if state.get('started_at'):
|
||||
try:
|
||||
start = datetime.fromisoformat(state['started_at']) if isinstance(state['started_at'], str) else state['started_at']
|
||||
uptime = int((datetime.now() - start).total_seconds())
|
||||
except: pass
|
||||
self._send_json({
|
||||
'running': state.get('running', False), 'dry_run': state.get('dry_run', False),
|
||||
'uptime_seconds': uptime, 'intros_today': state.get('intros_today', 0),
|
||||
'matches_pending': pending, 'matches_sent': sent, 'matches_rejected': rejected, 'matches_total': total,
|
||||
'score_90_plus': s90, 'score_80_89': s80, 'score_70_79': s70, 'score_60_69': s60,
|
||||
})
|
||||
|
||||
def _handle_your_matches(self):
|
||||
"""matches involving the host - shows both directions"""
|
||||
import sqlite3
|
||||
import json as j
|
||||
from db.users import get_priority_users
|
||||
limit = 15
|
||||
if '?' in self.path:
|
||||
for p in self.path.split('?')[1].split('&'):
|
||||
if p.startswith('limit='):
|
||||
try: limit = int(p.split('=')[1])
|
||||
except: pass
|
||||
try:
|
||||
db = Database()
|
||||
users = get_priority_users(db.conn)
|
||||
if not users:
|
||||
self._send_json({'matches': [], 'host': None})
|
||||
db.close()
|
||||
return
|
||||
host = users[0]
|
||||
host_name = host.get('github') or host.get('name')
|
||||
conn = sqlite3.connect('/data/db/connectd.db')
|
||||
c = conn.cursor()
|
||||
c.execute("""SELECT m.id, m.overlap_score, m.overlap_reasons, m.status,
|
||||
h1.username, h1.platform, h1.contact,
|
||||
h2.username, h2.platform, h2.contact
|
||||
FROM matches m
|
||||
JOIN humans h1 ON m.human_a_id = h1.id
|
||||
JOIN humans h2 ON m.human_b_id = h2.id
|
||||
WHERE (h1.username = ? OR h2.username = ?)
|
||||
AND m.status = 'pending' AND m.overlap_score >= 60
|
||||
ORDER BY m.overlap_score DESC LIMIT ?""", (host_name, host_name, limit))
|
||||
matches = []
|
||||
for row in c.fetchall():
|
||||
if row[4] == host_name:
|
||||
other_user, other_platform = row[7], row[8]
|
||||
other_contact = j.loads(row[9]) if row[9] else {}
|
||||
else:
|
||||
other_user, other_platform = row[4], row[5]
|
||||
other_contact = j.loads(row[6]) if row[6] else {}
|
||||
reasons = j.loads(row[2]) if row[2] else []
|
||||
matches.append({
|
||||
'id': row[0], 'score': int(row[1]), 'reasons': reasons,
|
||||
'status': row[3], 'other_user': other_user, 'other_platform': other_platform,
|
||||
'contact': other_contact.get('email') or other_contact.get('mastodon') or ''
|
||||
})
|
||||
conn.close()
|
||||
db.close()
|
||||
self._send_json({'host': host_name, 'matches': matches})
|
||||
except Exception as e:
|
||||
self._send_json({'error': str(e)}, 500)
|
||||
|
||||
def _handle_preview_match_draft(self):
|
||||
"""preview draft for a match - dir=to_you or to_them"""
|
||||
import sqlite3
|
||||
import json as j
|
||||
from introd.groq_draft import draft_intro_with_llm
|
||||
from db.users import get_priority_users
|
||||
|
||||
match_id = None
|
||||
direction = 'to_you'
|
||||
if '?' in self.path:
|
||||
for p in self.path.split('?')[1].split('&'):
|
||||
if p.startswith('id='):
|
||||
try: match_id = int(p.split('=')[1])
|
||||
except: pass
|
||||
if p.startswith('dir='):
|
||||
direction = p.split('=')[1]
|
||||
|
||||
if not match_id:
|
||||
self._send_json({'error': 'need ?id=match_id'}, 400)
|
||||
return
|
||||
|
||||
cache_key = f"{match_id}_{direction}"
|
||||
cached = get_cached_draft(cache_key, 'match')
|
||||
if cached:
|
||||
cached['cached'] = True
|
||||
self._send_json(cached)
|
||||
return
|
||||
|
||||
try:
|
||||
db = Database()
|
||||
users = get_priority_users(db.conn)
|
||||
if not users:
|
||||
self._send_json({'error': 'no priority user'}, 404)
|
||||
db.close()
|
||||
return
|
||||
host = users[0]
|
||||
host_name = host.get('github') or host.get('name')
|
||||
|
||||
conn = sqlite3.connect('/data/db/connectd.db')
|
||||
c = conn.cursor()
|
||||
c.execute("""SELECT h1.username, h1.platform, h1.contact, h1.extra,
|
||||
h2.username, h2.platform, h2.contact, h2.extra,
|
||||
m.overlap_score, m.overlap_reasons
|
||||
FROM matches m
|
||||
JOIN humans h1 ON m.human_a_id = h1.id
|
||||
JOIN humans h2 ON m.human_b_id = h2.id
|
||||
WHERE m.id = ?""", (match_id,))
|
||||
row = c.fetchone()
|
||||
conn.close()
|
||||
db.close()
|
||||
|
||||
if not row:
|
||||
self._send_json({'error': 'match not found'}, 404)
|
||||
return
|
||||
|
||||
human_a = {'username': row[0], 'platform': row[1],
|
||||
'contact': j.loads(row[2]) if row[2] else {},
|
||||
'extra': j.loads(row[3]) if row[3] else {}}
|
||||
human_b = {'username': row[4], 'platform': row[5],
|
||||
'contact': j.loads(row[6]) if row[6] else {},
|
||||
'extra': j.loads(row[7]) if row[7] else {}}
|
||||
reasons = j.loads(row[9]) if row[9] else []
|
||||
|
||||
if human_a['username'] == host_name:
|
||||
host_human, other_human = human_a, human_b
|
||||
else:
|
||||
host_human, other_human = human_b, human_a
|
||||
|
||||
if direction == 'to_you':
|
||||
match_data = {'human_a': host_human, 'human_b': other_human,
|
||||
'overlap_score': row[8], 'overlap_reasons': reasons}
|
||||
recipient_name = host_name
|
||||
about_name = other_human['username']
|
||||
else:
|
||||
match_data = {'human_a': other_human, 'human_b': host_human,
|
||||
'overlap_score': row[8], 'overlap_reasons': reasons}
|
||||
recipient_name = other_human['username']
|
||||
about_name = host_name
|
||||
|
||||
result, error = draft_intro_with_llm(match_data, recipient='a', dry_run=True)
|
||||
if error:
|
||||
self._send_json({'error': error}, 500)
|
||||
return
|
||||
|
||||
response = {
|
||||
'match_id': match_id,
|
||||
'direction': direction,
|
||||
'to': recipient_name,
|
||||
'about': about_name,
|
||||
'subject': result.get('subject'),
|
||||
'draft': result.get('draft'),
|
||||
'score': row[8],
|
||||
'cached': False,
|
||||
}
|
||||
cache_draft(cache_key, response, 'match')
|
||||
self._send_json(response)
|
||||
except Exception as e:
|
||||
self._send_json({'error': str(e)}, 500)
|
||||
|
||||
    def _handle_host_matches(self):
        """matches for priority user"""
        import sqlite3
        import json as j
        from db.users import get_priority_users
        limit = 20
        if '?' in self.path:
            for p in self.path.split('?')[1].split('&'):
                if p.startswith('limit='):
                    try:
                        limit = int(p.split('=')[1])
                    except ValueError:
                        pass
        try:
            db = Database()
            users = get_priority_users(db.conn)
            if not users:
                self._send_json({'matches': [], 'host': None})
                db.close()
                return
            host = users[0]
            conn = sqlite3.connect('/data/db/connectd.db')
            c = conn.cursor()
            c.execute("""SELECT pm.id, pm.overlap_score, pm.overlap_reasons, pm.status, h.username, h.platform, h.contact
                         FROM priority_matches pm JOIN humans h ON pm.matched_human_id = h.id
                         WHERE pm.priority_user_id = ? ORDER BY pm.overlap_score DESC LIMIT ?""", (host['id'], limit))
            matches = []
            for row in c.fetchall():
                reasons = j.loads(row[2]) if row[2] else []
                contact = j.loads(row[6]) if row[6] else {}
                matches.append({'id': row[0], 'score': int(row[1]), 'reasons': reasons, 'status': row[3],
                                'other_user': row[4], 'other_platform': row[5],
                                'contact': contact.get('email') or contact.get('mastodon') or contact.get('github') or ''})
            conn.close()
            db.close()
            self._send_json({'host': host.get('github') or host.get('name'), 'matches': matches})
        except Exception as e:
            self._send_json({'error': str(e)}, 500)

    def _handle_preview_host_draft(self):
        """preview draft for a priority match - dir=to_you or to_them"""
        import sqlite3
        import json as j
        from introd.groq_draft import draft_intro_with_llm
        from db.users import get_priority_users

        match_id = None
        direction = 'to_you'
        if '?' in self.path:
            for p in self.path.split('?')[1].split('&'):
                if p.startswith('id='):
                    try:
                        match_id = int(p.split('=')[1])
                    except ValueError:
                        pass
                if p.startswith('dir='):
                    direction = p.split('=')[1]

        if not match_id:
            self._send_json({'error': 'need ?id=match_id'}, 400)
            return

        cache_key = f"host_{match_id}_{direction}"
        cached = get_cached_draft(cache_key, 'host')
        if cached:
            cached['cached'] = True
            self._send_json(cached)
            return

        try:
            db = Database()
            users = get_priority_users(db.conn)
            if not users:
                self._send_json({'error': 'no priority user'}, 404)
                db.close()
                return
            host = users[0]

            conn = sqlite3.connect('/data/db/connectd.db')
            c = conn.cursor()
            # get the matched human from priority_matches
            c.execute("""SELECT h.username, h.platform, h.contact, h.extra, pm.overlap_score, pm.overlap_reasons
                         FROM priority_matches pm
                         JOIN humans h ON pm.matched_human_id = h.id
                         WHERE pm.id = ?""", (match_id,))
            row = c.fetchone()
            conn.close()
            db.close()

            if not row:
                self._send_json({'error': 'match not found'}, 404)
                return

            # the matched person (who we found for the host)
            other = {'username': row[0], 'platform': row[1],
                     'contact': j.loads(row[2]) if row[2] else {},
                     'extra': j.loads(row[3]) if row[3] else {}}

            # the host, shaped like a discovered human
            host_human = {'username': host.get('github') or host.get('name'),
                          'platform': 'priority',
                          'contact': {'email': host.get('email'), 'mastodon': host.get('mastodon'), 'github': host.get('github')},
                          'extra': {'bio': host.get('bio'), 'interests': host.get('interests')}}

            reasons = j.loads(row[5]) if row[5] else []

            # direction determines who gets the intro: human_a is the recipient, human_b the subject
            if direction == 'to_you':
                # intro TO host ABOUT other
                match_data = {'human_a': host_human, 'human_b': other,
                              'overlap_score': row[4], 'overlap_reasons': reasons}
                to_name = host.get('github') or host.get('name')
                about_name = other['username']
            else:
                # intro TO other ABOUT host
                match_data = {'human_a': other, 'human_b': host_human,
                              'overlap_score': row[4], 'overlap_reasons': reasons}
                to_name = other['username']
                about_name = host.get('github') or host.get('name')

            result, error = draft_intro_with_llm(match_data, recipient='a', dry_run=True)
            if error:
                self._send_json({'error': error}, 500)
                return

            response = {
                'match_id': match_id,
                'direction': direction,
                'to': to_name,
                'about': about_name,
                'subject': result.get('subject'),
                'draft': result.get('draft'),
                'score': row[4],
                'cached': False,
            }
            cache_draft(cache_key, response, 'host')
            self._send_json(response)
        except Exception as e:
            self._send_json({'error': str(e)}, 500)

    def _handle_preview_draft(self):
        import sqlite3
        import json as j
        from introd.groq_draft import draft_intro_with_llm

        match_id = None
        if '?' in self.path:
            for p in self.path.split('?')[1].split('&'):
                if p.startswith('id='):
                    try:
                        match_id = int(p.split('=')[1])
                    except ValueError:
                        pass

        if not match_id:
            self._send_json({'error': 'need ?id=match_id'}, 400)
            return

        # check cache first
        cached = get_cached_draft(match_id, 'queue')
        if cached:
            cached['cached'] = True
            self._send_json(cached)
            return

        try:
            conn = sqlite3.connect('/data/db/connectd.db')
            c = conn.cursor()
            c.execute("""SELECT h1.username, h1.platform, h1.contact, h1.extra,
                                h2.username, h2.platform, h2.contact, h2.extra,
                                m.overlap_score, m.overlap_reasons
                         FROM matches m
                         JOIN humans h1 ON m.human_a_id = h1.id
                         JOIN humans h2 ON m.human_b_id = h2.id
                         WHERE m.id = ?""", (match_id,))
            row = c.fetchone()
            conn.close()

            if not row:
                self._send_json({'error': 'match not found'}, 404)
                return

            human_a = {'username': row[0], 'platform': row[1],
                       'contact': j.loads(row[2]) if row[2] else {},
                       'extra': j.loads(row[3]) if row[3] else {}}
            human_b = {'username': row[4], 'platform': row[5],
                       'contact': j.loads(row[6]) if row[6] else {},
                       'extra': j.loads(row[7]) if row[7] else {}}
            reasons = j.loads(row[9]) if row[9] else []

            match_data = {'human_a': human_a, 'human_b': human_b,
                          'overlap_score': row[8], 'overlap_reasons': reasons}

            result, error = draft_intro_with_llm(match_data, recipient='a', dry_run=True)
            if error:
                self._send_json({'error': error}, 500)
                return

            response = {
                'match_id': match_id,
                'to': human_a['username'],
                'about': human_b['username'],
                'subject': result.get('subject'),
                'draft': result.get('draft'),
                'score': row[8],
                'cached': False,
            }
            cache_draft(match_id, response, 'queue')
            self._send_json(response)
        except Exception as e:
            self._send_json({'error': str(e)}, 500)

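# NOTE: get_cached_draft/cache_draft are defined elsewhere in this module and may
# well persist to disk; the sketch below is only a hypothetical in-memory stand-in
# with the same (key, kind) call shape, to illustrate the TTL-cache idea the
# preview handlers rely on. Names and the one-hour TTL are assumptions.

```python
import time

_DRAFT_CACHE = {}   # (kind, key) -> (expires_at, payload)
CACHE_TTL = 3600    # assumed: a drafted intro stays fresh for an hour


def cache_draft(key, payload, kind):
    """store a drafted intro under (kind, key)"""
    _DRAFT_CACHE[(kind, key)] = (time.time() + CACHE_TTL, dict(payload))


def get_cached_draft(key, kind):
    """return a cached draft dict, or None if missing/expired"""
    entry = _DRAFT_CACHE.get((kind, key))
    if not entry:
        return None
    expires_at, payload = entry
    if time.time() > expires_at:
        del _DRAFT_CACHE[(kind, key)]
        return None
    return dict(payload)
```

# Copies are returned/stored so handlers can set response['cached'] = True
# without mutating the cached entry.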
    def _handle_pending_about_you(self):
        """pending intros where host is human_b (being introduced to others)"""
        import sqlite3
        import json as j
        from db.users import get_priority_users
        limit = 10
        if '?' in self.path:
            for p in self.path.split('?')[1].split('&'):
                if p.startswith('limit='):
                    try:
                        limit = int(p.split('=')[1])
                    except ValueError:
                        pass
        try:
            db = Database()
            users = get_priority_users(db.conn)
            if not users:
                self._send_json({'matches': []})
                db.close()
                return
            host = users[0]
            host_name = host.get('github') or host.get('name')
            conn = sqlite3.connect('/data/db/connectd.db')
            c = conn.cursor()
            c.execute("""SELECT m.id, h1.username, h1.platform, h1.contact,
                                m.overlap_score, m.overlap_reasons
                         FROM matches m
                         JOIN humans h1 ON m.human_a_id = h1.id
                         JOIN humans h2 ON m.human_b_id = h2.id
                         WHERE h2.username = ? AND m.status = 'pending' AND m.overlap_score >= 60
                         ORDER BY m.overlap_score DESC LIMIT ?""", (host_name, limit))
            matches = []
            for row in c.fetchall():
                contact = j.loads(row[3]) if row[3] else {}
                reasons = j.loads(row[5]) if row[5] else []
                method = 'email' if contact.get('email') else ('mastodon' if contact.get('mastodon') else None)
                matches.append({'id': row[0], 'to_user': row[1], 'to_platform': row[2],
                                'score': int(row[4]), 'reasons': reasons[:3], 'method': method,
                                'contact': contact.get('email') or contact.get('mastodon') or ''})
            conn.close()
            db.close()
            self._send_json({'matches': matches})
        except Exception as e:
            self._send_json({'error': str(e)}, 500)

    def _handle_pending_to_you(self):
        """pending intros where host is human_a (receiving intro about others)"""
        import sqlite3
        import json as j
        from db.users import get_priority_users
        limit = 20
        if '?' in self.path:
            for p in self.path.split('?')[1].split('&'):
                if p.startswith('limit='):
                    try:
                        limit = int(p.split('=')[1])
                    except ValueError:
                        pass
        try:
            db = Database()
            users = get_priority_users(db.conn)
            if not users:
                self._send_json({'matches': []})
                db.close()
                return
            host = users[0]
            conn = sqlite3.connect('/data/db/connectd.db')
            c = conn.cursor()
            c.execute("""SELECT pm.id, h.username, h.platform, pm.overlap_score, pm.overlap_reasons
                         FROM priority_matches pm
                         JOIN humans h ON pm.matched_human_id = h.id
                         WHERE pm.priority_user_id = ? AND pm.status IN ('new', 'pending')
                         ORDER BY pm.overlap_score DESC LIMIT ?""", (host['id'], limit))
            matches = []
            for row in c.fetchall():
                reasons = j.loads(row[4]) if row[4] else []
                matches.append({'id': row[0], 'about_user': row[1], 'about_platform': row[2],
                                'score': int(row[3]), 'reasons': reasons[:3]})
            conn.close()
            db.close()
            self._send_json({'matches': matches})
        except Exception as e:
            self._send_json({'error': str(e)}, 500)

    def _handle_pending_matches(self):
        """pending matches - returns BOTH directions for each match"""
        import sqlite3
        import json as j
        limit = 30
        if '?' in self.path:
            for p in self.path.split('?')[1].split('&'):
                if p.startswith('limit='):
                    try:
                        limit = int(p.split('=')[1])
                    except ValueError:
                        pass
        try:
            conn = sqlite3.connect('/data/db/connectd.db')
            c = conn.cursor()
            # each row yields two entries below, so fetch half the limit
            c.execute("""SELECT m.id, h1.username, h1.platform, h1.contact,
                                h2.username, h2.platform, h2.contact, m.overlap_score, m.overlap_reasons
                         FROM matches m
                         JOIN humans h1 ON m.human_a_id = h1.id
                         JOIN humans h2 ON m.human_b_id = h2.id
                         WHERE m.status = 'pending' AND m.overlap_score >= 60
                         ORDER BY m.overlap_score DESC LIMIT ?""", (limit // 2,))
            matches = []
            for row in c.fetchall():
                contact_a = j.loads(row[3]) if row[3] else {}
                contact_b = j.loads(row[6]) if row[6] else {}
                reasons = j.loads(row[8]) if row[8] else []
                # direction 1: TO human_a ABOUT human_b
                method_a = 'email' if contact_a.get('email') else ('mastodon' if contact_a.get('mastodon') else None)
                matches.append({'id': row[0], 'to_user': row[1], 'about_user': row[4],
                                'score': int(row[7]), 'reasons': reasons[:3], 'method': method_a,
                                'contact': contact_a.get('email') or contact_a.get('mastodon') or ''})
                # direction 2: TO human_b ABOUT human_a
                method_b = 'email' if contact_b.get('email') else ('mastodon' if contact_b.get('mastodon') else None)
                matches.append({'id': row[0], 'to_user': row[4], 'about_user': row[1],
                                'score': int(row[7]), 'reasons': reasons[:3], 'method': method_b,
                                'contact': contact_b.get('email') or contact_b.get('mastodon') or ''})
            conn.close()
            self._send_json({'matches': matches})
        except Exception as e:
            self._send_json({'error': str(e)}, 500)

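# The handlers above parse "?limit=N" and "?id=N" by splitting the path by hand.
# The same result can be sketched with the standard library's urllib.parse; this
# helper is hypothetical (not part of the repo), shown only as an alternative.

```python
from urllib.parse import urlparse, parse_qs


def query_int(path, name, default):
    """read an integer query parameter from a request path, falling back on default"""
    qs = parse_qs(urlparse(path).query)
    try:
        return int(qs[name][0])
    except (KeyError, ValueError, IndexError):
        return default
```

# e.g. query_int(self.path, 'limit', 30) would replace the manual split loops.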
    def _handle_stats(self):
        """return database statistics"""
        try:
            db = Database()
            stats = db.stats()
            db.close()
            self._send_json(stats)
        except Exception as e:
            self._send_json({'error': str(e)}, 500)

    def _handle_health(self):
        """return daemon health status"""
        state = get_daemon_state()

        health = {
            'status': 'running' if state['running'] else 'stopped',
            'dry_run': state['dry_run'],
            'uptime_seconds': None,
        }

        if state['started_at']:
            uptime = datetime.now() - datetime.fromisoformat(state['started_at'])
            health['uptime_seconds'] = int(uptime.total_seconds())

        self._send_json(health)

    def _handle_state(self):
        """return full daemon state"""
        state = get_daemon_state()

        # convert datetimes to strings
        for key in ['last_scout', 'last_match', 'last_intro', 'last_lost', 'started_at']:
            if state[key] and isinstance(state[key], datetime):
                state[key] = state[key].isoformat()

        self._send_json(state)

    def _handle_priority_matches(self):
        """return priority matches for HA sensor"""
        try:
            db = Database()
            users = get_priority_users(db.conn)

            if not users:
                self._send_json({
                    'count': 0,
                    'new_count': 0,
                    'top_matches': [],
                })
                db.close()
                return

            # get matches for first priority user (host)
            user = users[0]
            matches = get_priority_user_matches(db.conn, user['id'], limit=10)

            new_count = sum(1 for m in matches if m.get('status') == 'new')

            top_matches = []
            for m in matches[:5]:
                overlap_reasons = m.get('overlap_reasons', '[]')
                if isinstance(overlap_reasons, str):
                    import json as json_mod
                    overlap_reasons = json_mod.loads(overlap_reasons) if overlap_reasons else []

                top_matches.append({
                    'username': m.get('username'),
                    'platform': m.get('platform'),
                    'score': m.get('score', 0),
                    'overlap_score': m.get('overlap_score', 0),
                    'reasons': overlap_reasons[:3],
                    'url': m.get('url'),
                    'status': m.get('status', 'new'),
                })

            db.close()
            self._send_json({
                'count': len(matches),
                'new_count': new_count,
                'top_matches': top_matches,
            })
        except Exception as e:
            self._send_json({'error': str(e)}, 500)

    def _handle_top_humans(self):
        """return top scoring humans for HA sensor"""
        try:
            db = Database()
            humans = db.get_all_humans(min_score=50, limit=5)

            top_humans = []
            for h in humans:
                contact = h.get('contact', '{}')
                if isinstance(contact, str):
                    import json as json_mod
                    contact = json_mod.loads(contact) if contact else {}

                signals = h.get('signals', '[]')
                if isinstance(signals, str):
                    import json as json_mod
                    signals = json_mod.loads(signals) if signals else []

                top_humans.append({
                    'username': h.get('username'),
                    'platform': h.get('platform'),
                    'score': h.get('score', 0),
                    'name': h.get('name'),
                    'signals': signals[:5],
                    'contact_method': 'email' if contact.get('email') else
                                      'mastodon' if contact.get('mastodon') else
                                      'matrix' if contact.get('matrix') else 'manual',
                })

            db.close()
            self._send_json({
                'count': len(humans),
                'top_humans': top_humans,
            })
        except Exception as e:
            self._send_json({'error': str(e)}, 500)

    def _handle_user(self):
        """return priority user info for HA sensor"""
        try:
            db = Database()
            users = get_priority_users(db.conn)

            if not users:
                self._send_json({
                    'configured': False,
                    'score': 0,
                    'signals': [],
                    'match_count': 0,
                })
                db.close()
                return

            user = users[0]
            signals = user.get('signals', '[]')
            if isinstance(signals, str):
                import json as json_mod
                signals = json_mod.loads(signals) if signals else []

            interests = user.get('interests', '[]')
            if isinstance(interests, str):
                import json as json_mod
                interests = json_mod.loads(interests) if interests else []

            matches = get_priority_user_matches(db.conn, user['id'], limit=100)

            db.close()
            self._send_json({
                'configured': True,
                'name': user.get('name'),
                'github': user.get('github'),
                'mastodon': user.get('mastodon'),
                'reddit': user.get('reddit'),
                'lobsters': user.get('lobsters'),
                'matrix': user.get('matrix'),
                'lemmy': user.get('lemmy'),
                'discord': user.get('discord'),
                'bluesky': user.get('bluesky'),
                'score': user.get('score', 0),
                'signals': signals[:10],
                'interests': interests,
                'location': user.get('location'),
                'bio': user.get('bio'),
                'match_count': len(matches),
                'new_match_count': sum(1 for m in matches if m.get('status') == 'new'),
            })
        except Exception as e:
            self._send_json({'error': str(e)}, 500)


def run_api_server():
    """run the API server in a thread"""
    server = HTTPServer(('0.0.0.0', API_PORT), APIHandler)
    print(f"connectd api running on port {API_PORT}")
    server.serve_forever()


def start_api_thread():
    """start API server in background thread"""
    thread = threading.Thread(target=run_api_server, daemon=True)
    thread.start()
    return thread


if __name__ == '__main__':
    # standalone mode for testing
    print(f"starting connectd api on port {API_PORT}...")
    run_api_server()
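# The run_api_server/start_api_thread pattern above (a daemon thread wrapping an
# HTTPServer whose handler writes JSON) can be exercised end to end without the
# daemon or database. A minimal self-contained sketch; the stub handler, its
# payload, and the ephemeral port are assumptions for illustration only.

```python
import json
import threading
import urllib.request
from http.server import HTTPServer, BaseHTTPRequestHandler


class StubHandler(BaseHTTPRequestHandler):
    """minimal JSON handler in the same shape as APIHandler"""

    def log_message(self, format, *args):
        pass  # suppress per-request logging, as the real handler does

    def do_GET(self):
        body = json.dumps({'status': 'running'}).encode()
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.end_headers()
        self.wfile.write(body)


# bind to an ephemeral port so the sketch never collides with a real daemon
server = HTTPServer(('127.0.0.1', 0), StubHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/api/health"
with urllib.request.urlopen(url) as resp:
    health = json.loads(resp.read())
server.shutdown()
```

# The same request shape works against a live instance, e.g.
# urllib.request.urlopen('http://localhost:8099/api/health').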
api.py.backup.20251215_221410 (normal file, 1187 lines): diff suppressed because it is too large

api_orig.py (normal file, 608 lines):
@@ -0,0 +1,608 @@
#!/usr/bin/env python3
"""
connectd/api.py - REST API for stats and control

exposes daemon stats for home assistant integration.
runs on port 8099 by default.
"""

import os
import json
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler
from datetime import datetime

from db import Database
from db.users import get_priority_users, get_priority_user_matches, get_priority_user

API_PORT = int(os.environ.get('CONNECTD_API_PORT', 8099))

# shared state (updated by daemon)
_daemon_state = {
    'running': False,
    'dry_run': False,
    'last_scout': None,
    'last_match': None,
    'last_intro': None,
    'last_lost': None,
    'intros_today': 0,
    'lost_intros_today': 0,
    'started_at': None,
}


def update_daemon_state(state_dict):
    """update shared daemon state (called by daemon)"""
    global _daemon_state
    _daemon_state.update(state_dict)


def get_daemon_state():
    """get current daemon state"""
    return _daemon_state.copy()


class APIHandler(BaseHTTPRequestHandler):
    """simple REST API handler"""

    def log_message(self, format, *args):
        """suppress default logging"""
        pass

    def _send_json(self, data, status=200):
        """send JSON response"""
        self.send_response(status)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Access-Control-Allow-Origin', '*')
        self.end_headers()
        self.wfile.write(json.dumps(data).encode())

    def do_GET(self):
        """handle GET requests"""
        path = self.path.split('?')[0]  # strip query params for routing
        if path == '/api/stats':
            self._handle_stats()
        elif path == '/api/health':
            self._handle_health()
        elif path == '/api/state':
            self._handle_state()
        elif path == '/api/priority_matches':
            self._handle_priority_matches()
        elif path == '/api/top_humans':
            self._handle_top_humans()
        elif path == '/api/user':
            self._handle_user()
        elif path == '/dashboard' or path == '/':
            self._handle_dashboard()
        elif path == '/api/preview_intros':
            self._handle_preview_intros()
        elif path == '/api/sent_intros':
            self._handle_sent_intros()
        elif path == '/api/failed_intros':
            self._handle_failed_intros()
        else:
            self._send_json({'error': 'not found'}, 404)

    def _handle_stats(self):
        """return database statistics"""
        try:
            db = Database()
            stats = db.stats()
            db.close()
            self._send_json(stats)
        except Exception as e:
            self._send_json({'error': str(e)}, 500)

    def _handle_health(self):
        """return daemon health status"""
        state = get_daemon_state()

        health = {
            'status': 'running' if state['running'] else 'stopped',
            'dry_run': state['dry_run'],
            'uptime_seconds': None,
        }

        if state['started_at']:
            uptime = datetime.now() - datetime.fromisoformat(state['started_at'])
            health['uptime_seconds'] = int(uptime.total_seconds())

        self._send_json(health)

    def _handle_state(self):
        """return full daemon state"""
        state = get_daemon_state()

        # convert datetimes to strings
        for key in ['last_scout', 'last_match', 'last_intro', 'last_lost', 'started_at']:
            if state[key] and isinstance(state[key], datetime):
                state[key] = state[key].isoformat()

        self._send_json(state)

    def _handle_priority_matches(self):
        """return priority matches for HA sensor"""
        try:
            db = Database()
            users = get_priority_users(db.conn)

            if not users:
                self._send_json({
                    'count': 0,
                    'new_count': 0,
                    'top_matches': [],
                })
                db.close()
                return

            # get matches for first priority user (host)
            user = users[0]
            matches = get_priority_user_matches(db.conn, user['id'], limit=10)

            new_count = sum(1 for m in matches if m.get('status') == 'new')

            top_matches = []
            for m in matches[:5]:
                overlap_reasons = m.get('overlap_reasons', '[]')
                if isinstance(overlap_reasons, str):
                    import json as json_mod
                    overlap_reasons = json_mod.loads(overlap_reasons) if overlap_reasons else []

                top_matches.append({
                    'username': m.get('username'),
                    'platform': m.get('platform'),
                    'score': m.get('score', 0),
                    'overlap_score': m.get('overlap_score', 0),
                    'reasons': overlap_reasons[:3],
                    'url': m.get('url'),
                    'status': m.get('status', 'new'),
                })

            db.close()
            self._send_json({
                'count': len(matches),
                'new_count': new_count,
                'top_matches': top_matches,
            })
        except Exception as e:
            self._send_json({'error': str(e)}, 500)

    def _handle_top_humans(self):
        """return top scoring humans for HA sensor"""
        try:
            db = Database()
            humans = db.get_all_humans(min_score=50, limit=5)

            top_humans = []
            for h in humans:
                contact = h.get('contact', '{}')
                if isinstance(contact, str):
                    import json as json_mod
                    contact = json_mod.loads(contact) if contact else {}

                signals = h.get('signals', '[]')
                if isinstance(signals, str):
                    import json as json_mod
                    signals = json_mod.loads(signals) if signals else []

                top_humans.append({
                    'username': h.get('username'),
                    'platform': h.get('platform'),
                    'score': h.get('score', 0),
                    'name': h.get('name'),
                    'signals': signals[:5],
                    'contact_method': 'email' if contact.get('email') else
                                      'mastodon' if contact.get('mastodon') else
                                      'matrix' if contact.get('matrix') else 'manual',
                })

            db.close()
            self._send_json({
                'count': len(humans),
                'top_humans': top_humans,
            })
        except Exception as e:
            self._send_json({'error': str(e)}, 500)

    def _handle_user(self):
        """return priority user info for HA sensor"""
        try:
            db = Database()
            users = get_priority_users(db.conn)

            if not users:
                self._send_json({
                    'configured': False,
                    'score': 0,
                    'signals': [],
                    'match_count': 0,
                })
                db.close()
                return

            user = users[0]
            signals = user.get('signals', '[]')
            if isinstance(signals, str):
                import json as json_mod
                signals = json_mod.loads(signals) if signals else []

            interests = user.get('interests', '[]')
            if isinstance(interests, str):
                import json as json_mod
                interests = json_mod.loads(interests) if interests else []

            matches = get_priority_user_matches(db.conn, user['id'], limit=100)

            db.close()
            self._send_json({
                'configured': True,
                'name': user.get('name'),
                'github': user.get('github'),
                'mastodon': user.get('mastodon'),
                'reddit': user.get('reddit'),
                'lobsters': user.get('lobsters'),
                'matrix': user.get('matrix'),
                'lemmy': user.get('lemmy'),
                'discord': user.get('discord'),
                'bluesky': user.get('bluesky'),
                'score': user.get('score', 0),
                'signals': signals[:10],
                'interests': interests,
                'location': user.get('location'),
                'bio': user.get('bio'),
                'match_count': len(matches),
                'new_match_count': sum(1 for m in matches if m.get('status') == 'new'),
            })
        except Exception as e:
            self._send_json({'error': str(e)}, 500)


def run_api_server():
    """run the API server in a thread"""
    server = HTTPServer(('0.0.0.0', API_PORT), APIHandler)
    print(f"connectd api running on port {API_PORT}")
    server.serve_forever()


def start_api_thread():
    """start API server in background thread"""
    thread = threading.Thread(target=run_api_server, daemon=True)
    thread.start()
    return thread


if __name__ == '__main__':
    # standalone mode for testing
    print(f"starting connectd api on port {API_PORT}...")
    run_api_server()


# === DASHBOARD ENDPOINTS ===
|
||||
|
||||
DASHBOARD_HTML = """<!DOCTYPE html>
|
||||
<html>
|
||||
<head>
|
||||
<title>connectd dashboard</title>
|
||||
<meta charset="utf-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
||||
<style>
|
||||
* { box-sizing: border-box; margin: 0; padding: 0; }
|
||||
body {
|
||||
font-family: monospace;
|
||||
background: #0a0a0f;
|
||||
color: #00ffc8;
|
||||
padding: 20px;
|
||||
line-height: 1.6;
|
||||
}
|
||||
h1 { color: #c792ea; margin-bottom: 20px; }
|
||||
h2 { color: #82aaff; margin: 20px 0 10px; border-bottom: 1px solid #333; padding-bottom: 5px; }
|
||||
.stats { display: flex; gap: 20px; flex-wrap: wrap; margin-bottom: 20px; }
|
||||
.stat {
|
||||
background: #1a1a2e;
|
||||
padding: 15px 25px;
|
||||
border-radius: 8px;
|
||||
border: 1px solid #333;
|
||||
}
|
||||
.stat-value { font-size: 2em; color: #c792ea; }
|
||||
.stat-label { color: #888; font-size: 0.9em; }
|
||||
.intro-card {
|
||||
background: #1a1a2e;
|
||||
border: 1px solid #333;
|
||||
border-radius: 8px;
|
||||
padding: 15px;
|
||||
margin-bottom: 15px;
|
||||
}
|
||||
.intro-header {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
margin-bottom: 10px;
|
||||
color: #82aaff;
|
||||
}
|
||||
.intro-score {
|
||||
background: #2a2a4e;
|
||||
padding: 2px 8px;
|
||||
border-radius: 4px;
|
||||
color: #c792ea;
|
||||
}
|
||||
.intro-body {
|
||||
background: #0d0d15;
|
||||
padding: 15px;
|
||||
border-radius: 4px;
|
||||
white-space: pre-wrap;
|
||||
font-size: 0.95em;
|
||||
color: #ddd;
|
||||
}
|
||||
.intro-meta { color: #666; font-size: 0.85em; margin-top: 10px; }
|
||||
.method {
|
||||
display: inline-block;
|
||||
padding: 2px 6px;
|
||||
border-radius: 3px;
|
||||
font-size: 0.85em;
|
||||
}
|
||||
.method-email { background: #2d4a2d; color: #8f8; }
|
||||
.method-mastodon { background: #3d3a5c; color: #c792ea; }
|
||||
.method-github { background: #2d3a4a; color: #82aaff; }
|
||||
.method-manual { background: #4a3a2d; color: #ffa; }
|
||||
.tab-buttons { margin-bottom: 20px; }
|
||||
.tab-btn {
|
||||
background: #1a1a2e;
|
||||
border: 1px solid #333;
|
||||
color: #00ffc8;
|
||||
padding: 10px 20px;
|
||||
cursor: pointer;
|
||||
font-family: monospace;
|
||||
}
|
||||
.tab-btn.active { background: #2a2a4e; border-color: #00ffc8; }
|
||||
.tab-content { display: none; }
|
||||
.tab-content.active { display: block; }
|
||||
.refresh-btn {
|
||||
background: #00ffc8;
|
||||
color: #0a0a0f;
|
||||
border: none;
|
||||
padding: 10px 20px;
|
||||
cursor: pointer;
|
||||
font-family: monospace;
|
||||
font-weight: bold;
|
||||
margin-left: 20px;
|
||||
}
|
||||
.error { color: #ff6b6b; }
|
||||
.success { color: #69ff69; }
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<h1>connectd <span style="color:#666;font-size:0.6em">dashboard</span></h1>
|
||||
|
||||
<div class="stats" id="stats"></div>
|
||||
|
||||
<div class="tab-buttons">
|
||||
<button class="tab-btn active" onclick="showTab('pending')">pending previews</button>
|
||||
<button class="tab-btn" onclick="showTab('sent')">sent intros</button>
|
||||
<button class="tab-btn" onclick="showTab('failed')">failed</button>
|
||||
<button class="refresh-btn" onclick="loadAll()">refresh</button>
|
||||
</div>
|
||||
|
||||
<div id="pending" class="tab-content active"></div>
|
||||
<div id="sent" class="tab-content"></div>
|
||||
<div id="failed" class="tab-content"></div>
|
||||
|
||||
  <script>
    async function loadStats() {
      const res = await fetch('/api/stats');
      const data = await res.json();
      document.getElementById('stats').innerHTML = `
        <div class="stat"><div class="stat-value">${data.total_humans}</div><div class="stat-label">humans tracked</div></div>
        <div class="stat"><div class="stat-value">${data.total_matches}</div><div class="stat-label">total matches</div></div>
        <div class="stat"><div class="stat-value">${data.sent_intros}</div><div class="stat-label">intros sent</div></div>
        <div class="stat"><div class="stat-value">${data.high_score_humans}</div><div class="stat-label">high score</div></div>
      `;
    }

    async function loadPending() {
      const res = await fetch('/api/preview_intros?limit=10');
      const data = await res.json();
      let html = '<h2>pending intro previews</h2>';
      if (data.previews) {
        for (const p of data.previews) {
          html += `<div class="intro-card">
            <div class="intro-header">
              <span>${p.from_platform}:${p.from_user} -> ${p.to_platform}:${p.to_user}</span>
              <span class="intro-score">score: ${p.score}</span>
            </div>
            <div class="intro-body">${p.draft || '[generating...]'}</div>
            <div class="intro-meta">
              method: <span class="method method-${p.method}">${p.method}</span>
              | contact: ${p.contact_info || 'n/a'}
              | reasons: ${(p.reasons || []).slice(0, 2).join(', ') || 'aligned values'}
            </div>
          </div>`;
        }
      }
      document.getElementById('pending').innerHTML = html;
    }
    async function loadSent() {
      const res = await fetch('/api/sent_intros?limit=20');
      const data = await res.json();
      let html = '<h2>sent intros</h2>';
      if (data.sent) {
        for (const s of data.sent) {
          html += `<div class="intro-card">
            <div class="intro-header">
              <span>${s.recipient_id}</span>
              <span class="method method-${s.method}">${s.method}</span>
            </div>
            <div class="intro-meta">
              sent: ${s.timestamp} | score: ${s.overlap_score?.toFixed(0) || '?'}
            </div>
          </div>`;
        }
      }
      document.getElementById('sent').innerHTML = html;
    }
    async function loadFailed() {
      const res = await fetch('/api/failed_intros');
      const data = await res.json();
      let html = '<h2>failed deliveries</h2>';
      if (data.failed) {
        for (const f of data.failed) {
          html += `<div class="intro-card">
            <div class="intro-header">
              <span>${f.recipient_id}</span>
              <span class="method method-${f.method}">${f.method}</span>
            </div>
            <div class="intro-meta error">error: ${f.error}</div>
          </div>`;
        }
      }
      document.getElementById('failed').innerHTML = html;
    }
    function showTab(name, btn) {
      document.querySelectorAll('.tab-content').forEach(el => el.classList.remove('active'));
      document.querySelectorAll('.tab-btn').forEach(el => el.classList.remove('active'));
      document.getElementById(name).classList.add('active');
      // prefer the explicitly passed button; fall back to the legacy implicit event global
      (btn || event.target).classList.add('active');
    }

    function loadAll() {
      loadStats();
      loadPending();
      loadSent();
      loadFailed();
    }

    loadAll();
    setInterval(loadAll, 30000);
  </script>
</body>
</html>
"""


class DashboardMixin:
    """mixin to add dashboard endpoints to APIHandler"""

    def _handle_dashboard(self):
        """serve the dashboard HTML"""
        self.send_response(200)
        self.send_header('Content-Type', 'text/html')
        self.end_headers()
        self.wfile.write(DASHBOARD_HTML.encode())
    def _handle_preview_intros(self):
        """preview pending intros with draft generation"""
        import json
        import sqlite3
        from introd.groq_draft import draft_intro_with_llm, determine_contact_method

        # parse limit from query string
        limit = 5
        if '?' in self.path:
            query = self.path.split('?', 1)[1]
            for param in query.split('&'):
                if param.startswith('limit='):
                    try:
                        limit = int(param.split('=')[1])
                    except ValueError:
                        pass

        conn = sqlite3.connect('/data/db/connectd.db')
        c = conn.cursor()

        c.execute("""SELECT h1.username, h1.platform, h1.contact, h1.extra,
                            h2.username, h2.platform, h2.contact, h2.extra,
                            m.overlap_score, m.overlap_reasons
                     FROM matches m
                     JOIN humans h1 ON m.human_a_id = h1.id
                     JOIN humans h2 ON m.human_b_id = h2.id
                     WHERE m.status = 'pending' AND m.overlap_score >= 60
                     ORDER BY m.overlap_score DESC
                     LIMIT ?""", (limit,))

        previews = []
        for row in c.fetchall():
            human_a = {
                'username': row[0], 'platform': row[1],
                'contact': json.loads(row[2]) if row[2] else {},
                'extra': json.loads(row[3]) if row[3] else {}
            }
            human_b = {
                'username': row[4], 'platform': row[5],
                'contact': json.loads(row[6]) if row[6] else {},
                'extra': json.loads(row[7]) if row[7] else {}
            }
            reasons = json.loads(row[9]) if row[9] else []

            match_data = {
                'human_a': human_a, 'human_b': human_b,
                'overlap_score': row[8], 'overlap_reasons': reasons
            }

            # determine contact method
            method, contact_info = determine_contact_method(human_a)

            # generate draft; skip on failure rather than block the whole preview
            draft = None
            try:
                result, _ = draft_intro_with_llm(match_data, recipient='a', dry_run=True)
                if result:
                    draft = result.get('draft')
            except Exception:
                pass

            previews.append({
                'from_platform': human_b['platform'],
                'from_user': human_b['username'],
                'to_platform': human_a['platform'],
                'to_user': human_a['username'],
                'score': int(row[8]),
                'reasons': reasons[:3],
                'method': method,
                'contact_info': str(contact_info) if contact_info else None,
                'draft': draft
            })

        conn.close()
        self._send_json({'previews': previews})
    def _handle_sent_intros(self):
        """return sent intro history from the delivery log"""
        import json
        from pathlib import Path

        limit = 20
        if '?' in self.path:
            query = self.path.split('?', 1)[1]
            for param in query.split('&'):
                if param.startswith('limit='):
                    try:
                        limit = int(param.split('=')[1])
                    except ValueError:
                        pass

        log_path = Path('/app/data/delivery_log.json')
        if log_path.exists():
            with open(log_path) as f:
                log = json.load(f)
            sent = log.get('sent', [])[-limit:]
            sent.reverse()  # newest first
        else:
            sent = []

        self._send_json({'sent': sent})
    def _handle_failed_intros(self):
        """return failed delivery attempts"""
        import json
        from pathlib import Path

        log_path = Path('/app/data/delivery_log.json')
        if log_path.exists():
            with open(log_path) as f:
                log = json.load(f)
            failed = log.get('failed', [])
        else:
            failed = []

        self._send_json({'failed': failed})
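Both `limit=` handlers above walk the query string by hand. The stdlib `urllib.parse` can do the same parsing while also tolerating missing or malformed values; a minimal sketch (the `parse_limit` name is invented here, not part of the codebase):

```python
from urllib.parse import parse_qs, urlparse


def parse_limit(path: str, default: int = 20) -> int:
    """extract an integer ?limit= value from a request path, else fall back."""
    params = parse_qs(urlparse(path).query)
    try:
        return int(params['limit'][0])
    except (KeyError, ValueError):
        return default
```

Each handler could then call `parse_limit(self.path, default=20)` in place of the manual split loop.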
0	backups/data_20251215_194141/.gitkeep	Normal file
137	backups/data_20251215_194141/delivery_log.json	Normal file
@@ -0,0 +1,137 @@
{
  "sent": [
    {
      "recipient_id": "github:dwmw2",
      "recipient_name": "David Woodhouse",
      "method": "email",
      "contact_info": "dwmw2@infradead.org",
      "overlap_score": 172.01631023799695,
      "timestamp": "2025-12-15T23:14:45.542509",
      "success": true,
      "error": null
    },
    {
      "recipient_id": "github:pvizeli",
      "recipient_name": "Pascal Vizeli",
      "method": "email",
      "contact_info": "pascal.vizeli@syshack.ch",
      "overlap_score": 163.33333333333331,
      "timestamp": "2025-12-15T23:14:48.462716",
      "success": true,
      "error": null
    },
    {
      "recipient_id": "github:2234839",
      "recipient_name": "\u5d2e\u751f",
      "method": "email",
      "contact_info": "admin@shenzilong.cn",
      "overlap_score": 163.09442000261095,
      "timestamp": "2025-12-15T23:14:50.749442",
      "success": true,
      "error": null
    },
    {
      "recipient_id": "github:zomars",
      "recipient_name": "Omar L\u00f3pez",
      "method": "email",
      "contact_info": "zomars@me.com",
      "overlap_score": 138.9593178751708,
      "timestamp": "2025-12-16T00:39:43.266181",
      "success": true,
      "error": null
    },
    {
      "recipient_id": "github:joshuaboniface",
      "recipient_name": "Joshua M. Boniface",
      "method": "mastodon",
      "contact_info": "@joshuaboniface@www.youtube.com",
      "overlap_score": 136.06304901929022,
      "timestamp": "2025-12-16T00:59:21.763092",
      "success": true,
      "error": "https://mastodon.sudoxreboot.com/@connectd/115726533401043321"
    },
    {
      "recipient_id": "github:dariusk",
      "recipient_name": "Darius Kazemi",
      "method": "mastodon",
      "contact_info": "@darius@friend.camp",
      "overlap_score": 135.39490109778416,
      "timestamp": "2025-12-16T00:59:22.199945",
      "success": true,
      "error": "https://mastodon.sudoxreboot.com/@connectd/115726533505124538"
    }
  ],
  "failed": [
    {
      "recipient_id": "github:joyeusenoelle",
      "recipient_name": "No\u00eblle Anthony",
      "method": "mastodon",
      "contact_info": "@noelle@chat.noelle.codes",
      "overlap_score": 65,
      "timestamp": "2025-12-14T23:44:17.215796",
      "success": false,
      "error": "MASTODON_TOKEN not set"
    },
    {
      "recipient_id": "github:balloob",
      "recipient_name": "Paulus Schoutsen",
      "method": "mastodon",
      "contact_info": "@home_assistant@youtube.com",
      "overlap_score": 163.09442000261095,
      "timestamp": "2025-12-15T23:14:50.155178",
      "success": false,
      "error": "mastodon api error: 401 - {\"error\":\"The access token is invalid\"}"
    },
    {
      "recipient_id": "github:balloob",
      "recipient_name": "Paulus Schoutsen",
      "method": "mastodon",
      "contact_info": "@home_assistant@youtube.com",
      "overlap_score": 163.09442000261095,
      "timestamp": "2025-12-15T23:14:50.334902",
      "success": false,
      "error": "mastodon api error: 401 - {\"error\":\"The access token is invalid\"}"
    },
    {
      "recipient_id": "github:joshuaboniface",
      "recipient_name": "Joshua M. Boniface",
      "method": "mastodon",
      "contact_info": "@joshuaboniface@www.youtube.com",
      "overlap_score": 136.06304901929022,
      "timestamp": "2025-12-16T00:53:25.848601",
      "success": false,
      "error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794e05e490>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
    },
    {
      "recipient_id": "github:joshuaboniface",
      "recipient_name": "Joshua M. Boniface",
      "method": "mastodon",
      "contact_info": "@joshuaboniface@www.youtube.com",
      "overlap_score": 136.06304901929022,
      "timestamp": "2025-12-16T00:53:55.912872",
      "success": false,
      "error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794e07b1d0>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
    },
    {
      "recipient_id": "github:dariusk",
      "recipient_name": "Darius Kazemi",
      "method": "mastodon",
      "contact_info": "@darius@friend.camp",
      "overlap_score": 135.39490109778416,
      "timestamp": "2025-12-16T00:54:25.947404",
      "success": false,
      "error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794e0986d0>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
    },
    {
      "recipient_id": "github:dariusk",
      "recipient_name": "Darius Kazemi",
      "method": "mastodon",
      "contact_info": "@darius@friend.camp",
      "overlap_score": 135.39490109778416,
      "timestamp": "2025-12-16T00:54:55.982839",
      "success": false,
      "error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794de9dd90>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
    }
  ],
  "queued": []
}
371	backups/data_20251215_194141/manual_queue.json	Normal file
@@ -0,0 +1,371 @@
[
  {
    "platform": "reddit",
    "username": "julietcam84",
    "url": "https://reddit.com/u/julietcam84",
    "score": 195,
    "subreddits": [
      "cooperatives",
      "intentionalcommunity"
    ],
    "signals": [
      "cooperative",
      "community",
      "intentional_community",
      "remote"
    ],
    "reasons": [
      "active in: cooperatives, intentionalcommunity",
      "signals: cooperative, community, intentional_community, remote",
      "REDDIT-ONLY: needs manual review for outreach"
    ],
    "note": "reddit-only user - no external links found. DM manually if promising.",
    "queued_at": "2025-12-15T09:06:32.705954",
    "status": "pending"
  },
  {
    "platform": "reddit",
    "username": "MasterRoshi1620",
    "url": "https://reddit.com/u/MasterRoshi1620",
    "score": 159,
    "subreddits": [
      "selfhosted",
      "homelab"
    ],
    "signals": [
      "unix",
      "privacy",
      "selfhosted",
      "modern_lang",
      "containers"
    ],
    "reasons": [
      "active in: selfhosted, homelab",
      "signals: unix, privacy, selfhosted, modern_lang, containers",
      "REDDIT-ONLY: needs manual review for outreach"
    ],
    "note": "reddit-only user - no external links found. DM manually if promising.",
    "queued_at": "2025-12-15T22:54:56.414100",
    "status": "pending"
  },
  {
    "match": {
      "id": 2779,
      "human_a": {
        "id": 642,
        "username": "qcasey",
        "platform": "github",
        "name": "Quinn Casey",
        "url": "https://github.com/qcasey",
        "contact": "{\"email\": \"github@letterq.org\", \"emails\": [\"github@letterq.org\", \"134208@letterq.org\", \"ceo@business.net\", \"career@letterq.org\", \"recruitmentspam@letterq.org\"], \"blog\": \"https://quinncasey.com\", \"twitter\": null, \"mastodon\": \"@678995876047487016@discord.com\", \"bluesky\": \"quinncasey.com\", \"matrix\": null, \"lemmy\": null}",
        "signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
        "extra": "{\"topics\": [], \"languages\": {\"JavaScript\": 12, \"Python\": 21, \"Go\": 15, \"TypeScript\": 4, \"Svelte\": 1, \"Rust\": 1, \"Kotlin\": 2, \"HTML\": 1, \"CSS\": 2, \"C\": 1, \"Dart\": 2, \"Ruby\": 1, \"C++\": 2, \"Dockerfile\": 1, \"Java\": 1, \"Shell\": 1, \"PHP\": 1, \"AppleScript\": 1}, \"repo_count\": 100, \"total_stars\": 324, \"extra\": {\"topics\": [], \"languages\": {\"JavaScript\": 12, \"Python\": 21, \"Go\": 15, \"TypeScript\": 4, \"Svelte\": 1, \"Rust\": 1, \"Kotlin\": 2, \"HTML\": 1, \"CSS\": 2, \"C\": 1, \"Dart\": 2, \"Ruby\": 1, \"C++\": 2, \"Dockerfile\": 1, \"Java\": 1, \"Shell\": 1, \"PHP\": 1, \"AppleScript\": 1}, \"repo_count\": 100, \"total_stars\": 324, \"hireable\": true, \"handles\": {\"github\": \"qcasey\", \"telegram\": \"@qcasey\", \"bluesky\": \"quinncasey.com\", \"mastodon\": \"@678995876047487016@discord.com\"}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:43:55.251547\"}"
      },
      "human_b": {
        "id": 91,
        "username": "mib1185",
        "platform": "github",
        "name": "Michael",
        "url": "https://github.com/mib1185",
        "contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
        "signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
        "extra": "{\"topics\": [], \"languages\": {\"Python\": 27, \"HTML\": 2, \"TypeScript\": 1, \"Dockerfile\": 2, \"Shell\": 8, \"JavaScript\": 1, \"Jinja\": 2, \"PHP\": 1, \"Go\": 1}, \"repo_count\": 85, \"total_stars\": 136, \"extra\": {\"topics\": [], \"languages\": {\"Python\": 27, \"HTML\": 2, \"TypeScript\": 1, \"Dockerfile\": 2, \"Shell\": 8, \"JavaScript\": 1, \"Jinja\": 2, \"PHP\": 1, \"Go\": 1}, \"repo_count\": 85, \"total_stars\": 136, \"hireable\": null, \"handles\": {\"github\": \"ansible\"}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:08:57.297790\"}"
      },
      "overlap_score": 185.0,
      "overlap_reasons": "[\"shared values: unix, foss, federated_chat, home_automation, privacy\", \"both remote-friendly\", \"complementary skills: Kotlin, C++, Jinja, Ruby, CSS\"]"
    },
    "draft": "hi Quinn,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: using JavaScript, Python, Go | (100 repos) | interested in foss, home_automation, privacy\n\nMichael is building: using Python, HTML, TypeScript | (85 repos) | interested in foss, home_automation, privacy\n\noverlap: shared values: unix, foss, federated_chat, home_automation, privacy | both remote-friendly | complementary skills: Kotlin, C++, Jinja, Ruby, CSS\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/mib1185\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
    "recipient": {
      "id": 91,
      "username": "mib1185",
      "platform": "github",
      "name": "Michael",
      "url": "https://github.com/mib1185",
      "contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
      "signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
      "extra": "{\"topics\": [], \"languages\": {\"Python\": 27, \"HTML\": 2, \"TypeScript\": 1, \"Dockerfile\": 2, \"Shell\": 8, \"JavaScript\": 1, \"Jinja\": 2, \"PHP\": 1, \"Go\": 1}, \"repo_count\": 85, \"total_stars\": 136, \"extra\": {\"topics\": [], \"languages\": {\"Python\": 27, \"HTML\": 2, \"TypeScript\": 1, \"Dockerfile\": 2, \"Shell\": 8, \"JavaScript\": 1, \"Jinja\": 2, \"PHP\": 1, \"Go\": 1}, \"repo_count\": 85, \"total_stars\": 136, \"hireable\": null, \"handles\": {\"github\": \"ansible\"}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:08:57.297790\"}"
    },
    "queued_at": "2025-12-15T23:14:45.528184",
    "status": "pending"
  },
  {
    "match": {
      "id": 2795,
      "human_a": {
        "id": 642,
        "username": "qcasey",
        "platform": "github",
        "name": "Quinn Casey",
        "url": "https://github.com/qcasey",
        "contact": "{\"email\": \"github@letterq.org\", \"emails\": [\"github@letterq.org\", \"134208@letterq.org\", \"ceo@business.net\", \"career@letterq.org\", \"recruitmentspam@letterq.org\"], \"blog\": \"https://quinncasey.com\", \"twitter\": null, \"mastodon\": \"@678995876047487016@discord.com\", \"bluesky\": \"quinncasey.com\", \"matrix\": null, \"lemmy\": null}",
        "signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
        "extra": "{\"topics\": [], \"languages\": {\"JavaScript\": 12, \"Python\": 21, \"Go\": 15, \"TypeScript\": 4, \"Svelte\": 1, \"Rust\": 1, \"Kotlin\": 2, \"HTML\": 1, \"CSS\": 2, \"C\": 1, \"Dart\": 2, \"Ruby\": 1, \"C++\": 2, \"Dockerfile\": 1, \"Java\": 1, \"Shell\": 1, \"PHP\": 1, \"AppleScript\": 1}, \"repo_count\": 100, \"total_stars\": 324, \"extra\": {\"topics\": [], \"languages\": {\"JavaScript\": 12, \"Python\": 21, \"Go\": 15, \"TypeScript\": 4, \"Svelte\": 1, \"Rust\": 1, \"Kotlin\": 2, \"HTML\": 1, \"CSS\": 2, \"C\": 1, \"Dart\": 2, \"Ruby\": 1, \"C++\": 2, \"Dockerfile\": 1, \"Java\": 1, \"Shell\": 1, \"PHP\": 1, \"AppleScript\": 1}, \"repo_count\": 100, \"total_stars\": 324, \"hireable\": true, \"handles\": {\"github\": \"qcasey\", \"telegram\": \"@qcasey\", \"bluesky\": \"quinncasey.com\", \"mastodon\": \"@678995876047487016@discord.com\"}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:43:55.251547\"}"
      },
      "human_b": {
        "id": 110,
        "username": "RoboMagus",
        "platform": "github",
        "name": null,
        "url": "https://github.com/RoboMagus",
        "contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
        "signals": "[\"unix\", \"community\", \"foss\", \"home_automation\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
        "extra": "{\"topics\": [], \"languages\": {\"Python\": 17, \"Vue\": 3, \"HTML\": 1, \"JavaScript\": 11, \"C++\": 7, \"TypeScript\": 6, \"Go\": 3, \"Kotlin\": 1, \"Shell\": 4, \"Dockerfile\": 2, \"C\": 1, \"Less\": 1}, \"repo_count\": 86, \"total_stars\": 77, \"extra\": {\"topics\": [], \"languages\": {\"Python\": 17, \"Vue\": 3, \"HTML\": 1, \"JavaScript\": 11, \"C++\": 7, \"TypeScript\": 6, \"Go\": 3, \"Kotlin\": 1, \"Shell\": 4, \"Dockerfile\": 2, \"C\": 1, \"Less\": 1}, \"repo_count\": 86, \"total_stars\": 77, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:09:50.629088\"}"
      },
      "overlap_score": 173.03582460328593,
      "overlap_reasons": "[\"shared values: unix, foss, home_automation, privacy, community\", \"both remote-friendly\", \"complementary skills: Less, Ruby, CSS, Dart, PHP\"]"
    },
    "draft": "hi Quinn,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: using JavaScript, Python, Go | (100 repos) | interested in foss, home_automation, privacy\n\nRoboMagus is building: using Python, Vue, HTML | (86 repos) | interested in foss, home_automation, privacy\n\noverlap: shared values: unix, foss, home_automation, privacy, community | both remote-friendly | complementary skills: Less, Ruby, CSS, Dart, PHP\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/RoboMagus\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
    "recipient": {
      "id": 110,
      "username": "RoboMagus",
      "platform": "github",
      "name": null,
      "url": "https://github.com/RoboMagus",
      "contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
      "signals": "[\"unix\", \"community\", \"foss\", \"home_automation\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
      "extra": "{\"topics\": [], \"languages\": {\"Python\": 17, \"Vue\": 3, \"HTML\": 1, \"JavaScript\": 11, \"C++\": 7, \"TypeScript\": 6, \"Go\": 3, \"Kotlin\": 1, \"Shell\": 4, \"Dockerfile\": 2, \"C\": 1, \"Less\": 1}, \"repo_count\": 86, \"total_stars\": 77, \"extra\": {\"topics\": [], \"languages\": {\"Python\": 17, \"Vue\": 3, \"HTML\": 1, \"JavaScript\": 11, \"C++\": 7, \"TypeScript\": 6, \"Go\": 3, \"Kotlin\": 1, \"Shell\": 4, \"Dockerfile\": 2, \"C\": 1, \"Less\": 1}, \"repo_count\": 86, \"total_stars\": 77, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:09:50.629088\"}"
    },
    "queued_at": "2025-12-15T23:14:45.535258",
    "status": "pending"
  },
  {
    "match": {
      "id": 2768,
      "human_a": {
        "id": 642,
        "username": "qcasey",
        "platform": "github",
        "name": "Quinn Casey",
        "url": "https://github.com/qcasey",
        "contact": "{\"email\": \"github@letterq.org\", \"emails\": [\"github@letterq.org\", \"134208@letterq.org\", \"ceo@business.net\", \"career@letterq.org\", \"recruitmentspam@letterq.org\"], \"blog\": \"https://quinncasey.com\", \"twitter\": null, \"mastodon\": \"@678995876047487016@discord.com\", \"bluesky\": \"quinncasey.com\", \"matrix\": null, \"lemmy\": null}",
        "signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
        "extra": "{\"topics\": [], \"languages\": {\"JavaScript\": 12, \"Python\": 21, \"Go\": 15, \"TypeScript\": 4, \"Svelte\": 1, \"Rust\": 1, \"Kotlin\": 2, \"HTML\": 1, \"CSS\": 2, \"C\": 1, \"Dart\": 2, \"Ruby\": 1, \"C++\": 2, \"Dockerfile\": 1, \"Java\": 1, \"Shell\": 1, \"PHP\": 1, \"AppleScript\": 1}, \"repo_count\": 100, \"total_stars\": 324, \"extra\": {\"topics\": [], \"languages\": {\"JavaScript\": 12, \"Python\": 21, \"Go\": 15, \"TypeScript\": 4, \"Svelte\": 1, \"Rust\": 1, \"Kotlin\": 2, \"HTML\": 1, \"CSS\": 2, \"C\": 1, \"Dart\": 2, \"Ruby\": 1, \"C++\": 2, \"Dockerfile\": 1, \"Java\": 1, \"Shell\": 1, \"PHP\": 1, \"AppleScript\": 1}, \"repo_count\": 100, \"total_stars\": 324, \"hireable\": true, \"handles\": {\"github\": \"qcasey\", \"telegram\": \"@qcasey\", \"bluesky\": \"quinncasey.com\", \"mastodon\": \"@678995876047487016@discord.com\"}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:43:55.251547\"}"
      },
      "human_b": {
        "id": 415,
        "username": "sbilly",
        "platform": "github",
        "name": "sbilly",
        "url": "https://github.com/sbilly",
        "contact": "{\"email\": null, \"emails\": [], \"blog\": \"http://sbilly.com/\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
        "signals": "[\"unix\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"decentralized\", \"community\", \"modern_lang\", \"containers\", \"remote\"]",
        "extra": "{\"topics\": [\"mesh-network\"], \"languages\": {\"Go\": 4, \"Shell\": 4, \"Dockerfile\": 1, \"Python\": 12, \"JavaScript\": 14, \"Java\": 3, \"Ruby\": 3, \"CSS\": 3, \"C++\": 6, \"CoffeeScript\": 1, \"Scala\": 2, \"HTML\": 5, \"Vue\": 1, \"Clojure\": 1, \"PHP\": 3, \"TypeScript\": 1, \"C\": 8, \"Assembly\": 2, \"Objective-C\": 1, \"C#\": 1}, \"repo_count\": 100, \"total_stars\": 14354, \"extra\": {\"topics\": [\"mesh-network\"], \"languages\": {\"Go\": 4, \"Shell\": 4, \"Dockerfile\": 1, \"Python\": 12, \"JavaScript\": 14, \"Java\": 3, \"Ruby\": 3, \"CSS\": 3, \"C++\": 6, \"CoffeeScript\": 1, \"Scala\": 2, \"HTML\": 5, \"Vue\": 1, \"Clojure\": 1, \"PHP\": 3, \"TypeScript\": 1, \"C\": 8, \"Assembly\": 2, \"Objective-C\": 1, \"C#\": 1}, \"repo_count\": 100, \"total_stars\": 14354, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:29:03.191201\"}"
      },
      "overlap_score": 170.3406027914858,
      "overlap_reasons": "[\"shared values: unix, foss, federated_chat, home_automation, privacy\", \"both remote-friendly\", \"complementary skills: Kotlin, Clojure, Scala, Objective-C, Dart\"]"
    },
    "draft": "hi Quinn,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: using JavaScript, Python, Go | (100 repos) | interested in foss, home_automation, privacy\n\nsbilly is building: working on mesh-network | using Go, Shell, Dockerfile | (100 repos) | interested in foss, home_automation, privacy\n\noverlap: shared values: unix, foss, federated_chat, home_automation, privacy | both remote-friendly | complementary skills: Kotlin, Clojure, Scala, Objective-C, Dart\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/sbilly\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
    "recipient": {
      "id": 415,
      "username": "sbilly",
      "platform": "github",
      "name": "sbilly",
      "url": "https://github.com/sbilly",
      "contact": "{\"email\": null, \"emails\": [], \"blog\": \"http://sbilly.com/\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
      "signals": "[\"unix\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"decentralized\", \"community\", \"modern_lang\", \"containers\", \"remote\"]",
      "extra": "{\"topics\": [\"mesh-network\"], \"languages\": {\"Go\": 4, \"Shell\": 4, \"Dockerfile\": 1, \"Python\": 12, \"JavaScript\": 14, \"Java\": 3, \"Ruby\": 3, \"CSS\": 3, \"C++\": 6, \"CoffeeScript\": 1, \"Scala\": 2, \"HTML\": 5, \"Vue\": 1, \"Clojure\": 1, \"PHP\": 3, \"TypeScript\": 1, \"C\": 8, \"Assembly\": 2, \"Objective-C\": 1, \"C#\": 1}, \"repo_count\": 100, \"total_stars\": 14354, \"extra\": {\"topics\": [\"mesh-network\"], \"languages\": {\"Go\": 4, \"Shell\": 4, \"Dockerfile\": 1, \"Python\": 12, \"JavaScript\": 14, \"Java\": 3, \"Ruby\": 3, \"CSS\": 3, \"C++\": 6, \"CoffeeScript\": 1, \"Scala\": 2, \"HTML\": 5, \"Vue\": 1, \"Clojure\": 1, \"PHP\": 3, \"TypeScript\": 1, \"C\": 8, \"Assembly\": 2, \"Objective-C\": 1, \"C#\": 1}, \"repo_count\": 100, \"total_stars\": 14354, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:29:03.191201\"}"
    },
    "queued_at": "2025-12-15T23:14:48.455001",
    "status": "pending"
  },
  {
    "match": {
      "id": 10793,
      "human_a": {
        "id": 526,
        "username": "2234839",
        "platform": "github",
        "name": "\u5d2e\u751f",
        "url": "https://github.com/2234839",
        "contact": "{\"email\": \"admin@shenzilong.cn\", \"emails\": [\"admin@shenzilong.cn\"], \"blog\": \"https://shenzilong.cn\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
        "signals": "[\"unix\", \"community\", \"foss\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
        "extra": "{\"topics\": [], \"languages\": {\"TypeScript\": 54, \"Vue\": 7, \"JavaScript\": 12, \"Rust\": 1, \"CSS\": 3, \"Go\": 1, \"Ruby\": 1, \"HTML\": 1, \"Svelte\": 2}, \"repo_count\": 100, \"total_stars\": 528, \"extra\": {\"topics\": [], \"languages\": {\"TypeScript\": 54, \"Vue\": 7, \"JavaScript\": 12, \"Rust\": 1, \"CSS\": 3, \"Go\": 1, \"Ruby\": 1, \"HTML\": 1, \"Svelte\": 2}, \"repo_count\": 100, \"total_stars\": 528, \"hireable\": null, \"handles\": {\"github\": \"2234839\"}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:37:19.731768\"}"
      },
      "human_b": {
        "id": 212,
        "username": "uhthomas",
        "platform": "github",
        "name": "Thomas",
        "url": "https://github.com/uhthomas",
        "contact": "{\"email\": null, \"emails\": [], \"blog\": \"6f.io\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
        "signals": "[\"unix\", \"community\", \"foss\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
        "extra": "{\"topics\": [], \"languages\": {\"CUE\": 1, \"Dockerfile\": 4, \"Go\": 27, \"Starlark\": 10, \"Rust\": 2, \"Lua\": 1, \"JavaScript\": 3, \"Dart\": 1, \"Python\": 1, \"TypeScript\": 1}, \"repo_count\": 100, \"total_stars\": 138, \"extra\": {\"topics\": [], \"languages\": {\"CUE\": 1, \"Dockerfile\": 4, \"Go\": 27, \"Starlark\": 10, \"Rust\": 2, \"Lua\": 1, \"JavaScript\": 3, \"Dart\": 1, \"Python\": 1, \"TypeScript\": 1}, \"repo_count\": 100, \"total_stars\": 138, \"hireable\": true, \"handles\": {\"github\": \"uhthomas\"}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:16:14.638950\"}"
      },
      "overlap_score": 152.39313358485379,
      "overlap_reasons": "[\"shared values: unix, community, foss, selfhosted, modern_lang\", \"both remote-friendly\", \"complementary skills: Python, HTML, Ruby, CSS, CUE\"]"
    },
    "draft": "hi \u5d2e\u751f,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: using TypeScript, Vue, JavaScript | (100 repos) | interested in foss, privacy, selfhosted\n\nThomas is building: using CUE, Dockerfile, Go | (100 repos) | interested in foss, selfhosted\n\noverlap: shared values: unix, community, foss, selfhosted, modern_lang | both remote-friendly | complementary skills: Python, HTML, Ruby, CSS, CUE\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/uhthomas\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
    "recipient": {
      "id": 212,
      "username": "uhthomas",
      "platform": "github",
      "name": "Thomas",
      "url": "https://github.com/uhthomas",
      "contact": "{\"email\": null, \"emails\": [], \"blog\": \"6f.io\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
      "signals": "[\"unix\", \"community\", \"foss\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
      "extra": "{\"topics\": [], \"languages\": {\"CUE\": 1, \"Dockerfile\": 4, \"Go\": 27, \"Starlark\": 10, \"Rust\": 2, \"Lua\": 1, \"JavaScript\": 3, \"Dart\": 1, \"Python\": 1, \"TypeScript\": 1}, \"repo_count\": 100, \"total_stars\": 138, \"extra\": {\"topics\": [], \"languages\": {\"CUE\": 1, \"Dockerfile\": 4, \"Go\": 27, \"Starlark\": 10, \"Rust\": 2, \"Lua\": 1, \"JavaScript\": 3, \"Dart\": 1, \"Python\": 1, \"TypeScript\": 1}, \"repo_count\": 100, \"total_stars\": 138, \"hireable\": true, \"handles\": {\"github\": \"uhthomas\"}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:16:14.638950\"}"
    },
    "queued_at": "2025-12-16T00:33:56.913113",
    "status": "pending"
  },
{
|
||||
"match": {
|
||||
"id": 3924,
|
||||
"human_a": {
|
||||
"id": 777,
|
||||
"username": "joshuaboniface",
|
||||
"platform": "github",
|
||||
"name": "Joshua M. Boniface",
|
||||
"url": "https://github.com/joshuaboniface",
|
||||
"contact": "{\"email\": \"joshua@boniface.me\", \"emails\": [\"joshua@boniface.me\"], \"blog\": \"https://www.boniface.me\", \"twitter\": null, \"mastodon\": \"@joshuaboniface@www.youtube.com\", \"bluesky\": null, \"matrix\": null, \"lemmy\": \"@djbon2112@old.reddit.com\"}",
|
||||
"signals": "[\"unix\", \"foss\", \"federated_chat\", \"home_automation\", \"privacy\", \"decentralized\", \"selfhosted\", \"modern_lang\", \"remote\"]",
|
||||
"extra": "{\"topics\": [], \"languages\": {\"Python\": 17, \"C#\": 13, \"JavaScript\": 5, \"SCSS\": 1, \"Go\": 1, \"HTML\": 2, \"Shell\": 4, \"C++\": 2, \"Java\": 3}, \"repo_count\": 96, \"total_stars\": 1157, \"extra\": {\"topics\": [], \"languages\": {\"Python\": 17, \"C#\": 13, \"JavaScript\": 5, \"SCSS\": 1, \"Go\": 1, \"HTML\": 2, \"Shell\": 4, \"C++\": 2, \"Java\": 3}, \"repo_count\": 96, \"total_stars\": 1157, \"hireable\": null, \"handles\": {\"github\": \"joshuaboniface\", \"linkedin\": \"joshuamboniface\", \"mastodon\": \"@joshuaboniface@www.youtube.com\", \"lemmy\": \"@djbon2112@old.reddit.com\"}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:52:45.963017\"}"
|
||||
},
|
||||
"human_b": {
|
||||
"id": 228,
|
||||
"username": "mintsoft",
|
||||
"platform": "github",
|
||||
"name": "Rob Emery",
|
||||
"url": "https://github.com/mintsoft",
|
||||
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
|
||||
"signals": "[\"unix\", \"community\", \"foss\", \"privacy\", \"decentralized\", \"selfhosted\", \"modern_lang\", \"remote\"]",
|
||||
"extra": "{\"topics\": [], \"languages\": {\"Kotlin\": 4, \"Go\": 5, \"Python\": 7, \"C\": 2, \"Shell\": 3, \"Dart\": 1, \"Java\": 2, \"C#\": 1, \"PHP\": 2, \"C++\": 7, \"JavaScript\": 5, \"Perl\": 2, \"Makefile\": 1, \"HTML\": 1, \"PowerShell\": 1}, \"repo_count\": 100, \"total_stars\": 33, \"extra\": {\"topics\": [], \"languages\": {\"Kotlin\": 4, \"Go\": 5, \"Python\": 7, \"C\": 2, \"Shell\": 3, \"Dart\": 1, \"Java\": 2, \"C#\": 1, \"PHP\": 2, \"C++\": 7, \"JavaScript\": 5, \"Perl\": 2, \"Makefile\": 1, \"HTML\": 1, \"PowerShell\": 1}, \"repo_count\": 100, \"total_stars\": 33, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:17:08.966748\"}"
|
||||
},
|
||||
"overlap_score": 149.1843240344525,
|
||||
"overlap_reasons": "[\"shared values: unix, foss, privacy, decentralized, selfhosted\", \"both remote-friendly\", \"complementary skills: Kotlin, Makefile, PHP, Dart, SCSS\"]"
|
||||
},
|
||||
"draft": "hi Joshua,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: using Python, C#, JavaScript | (96 repos) | interested in foss, home_automation, privacy\n\nRob is building: using Kotlin, Go, Python | (100 repos) | interested in foss, privacy, selfhosted\n\noverlap: shared values: unix, foss, privacy, decentralized, selfhosted | both remote-friendly | complementary skills: Kotlin, Makefile, PHP, Dart, SCSS\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/mintsoft\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
|
||||
"recipient": {
|
||||
"id": 228,
|
||||
"username": "mintsoft",
|
||||
"platform": "github",
|
||||
"name": "Rob Emery",
|
||||
"url": "https://github.com/mintsoft",
|
||||
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
|
||||
"signals": "[\"unix\", \"community\", \"foss\", \"privacy\", \"decentralized\", \"selfhosted\", \"modern_lang\", \"remote\"]",
|
||||
"extra": "{\"topics\": [], \"languages\": {\"Kotlin\": 4, \"Go\": 5, \"Python\": 7, \"C\": 2, \"Shell\": 3, \"Dart\": 1, \"Java\": 2, \"C#\": 1, \"PHP\": 2, \"C++\": 7, \"JavaScript\": 5, \"Perl\": 2, \"Makefile\": 1, \"HTML\": 1, \"PowerShell\": 1}, \"repo_count\": 100, \"total_stars\": 33, \"extra\": {\"topics\": [], \"languages\": {\"Kotlin\": 4, \"Go\": 5, \"Python\": 7, \"C\": 2, \"Shell\": 3, \"Dart\": 1, \"Java\": 2, \"C#\": 1, \"PHP\": 2, \"C++\": 7, \"JavaScript\": 5, \"Perl\": 2, \"Makefile\": 1, \"HTML\": 1, \"PowerShell\": 1}, \"repo_count\": 100, \"total_stars\": 33, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:17:08.966748\"}"
|
||||
},
|
||||
"queued_at": "2025-12-16T00:33:56.920505",
|
||||
"status": "pending"
|
||||
},
|
||||
{
|
||||
"match": {
|
||||
"id": 13072,
|
||||
"human_a": {
|
||||
"id": 212,
|
||||
"username": "uhthomas",
|
||||
"platform": "github",
|
||||
"name": "Thomas",
|
||||
"url": "https://github.com/uhthomas",
|
||||
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"6f.io\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
|
||||
"signals": "[\"unix\", \"community\", \"foss\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
|
||||
"extra": "{\"topics\": [], \"languages\": {\"CUE\": 1, \"Dockerfile\": 4, \"Go\": 27, \"Starlark\": 10, \"Rust\": 2, \"Lua\": 1, \"JavaScript\": 3, \"Dart\": 1, \"Python\": 1, \"TypeScript\": 1}, \"repo_count\": 100, \"total_stars\": 138, \"extra\": {\"topics\": [], \"languages\": {\"CUE\": 1, \"Dockerfile\": 4, \"Go\": 27, \"Starlark\": 10, \"Rust\": 2, \"Lua\": 1, \"JavaScript\": 3, \"Dart\": 1, \"Python\": 1, \"TypeScript\": 1}, \"repo_count\": 100, \"total_stars\": 138, \"hireable\": true, \"handles\": {\"github\": \"uhthomas\"}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:16:14.638950\"}"
|
||||
},
|
||||
"human_b": {
|
||||
"id": 96,
|
||||
"username": "SlyBouhafs",
|
||||
"platform": "github",
|
||||
"name": "Sly",
|
||||
"url": "https://github.com/SlyBouhafs",
|
||||
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
|
||||
"signals": "[\"unix\", \"community\", \"foss\", \"selfhosted\", \"modern_lang\", \"remote\"]",
|
||||
"extra": "{\"topics\": [], \"languages\": {\"JavaScript\": 9, \"Makefile\": 1, \"Python\": 2, \"HTML\": 2, \"TypeScript\": 3, \"Lua\": 1, \"Vim script\": 1}, \"repo_count\": 29, \"total_stars\": 23, \"extra\": {\"topics\": [], \"languages\": {\"JavaScript\": 9, \"Makefile\": 1, \"Python\": 2, \"HTML\": 2, \"TypeScript\": 3, \"Lua\": 1, \"Vim script\": 1}, \"repo_count\": 29, \"total_stars\": 23, \"hireable\": true, \"handles\": {}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:09:12.423838\"}"
|
||||
},
|
||||
"overlap_score": 142.37165974941587,
|
||||
"overlap_reasons": "[\"shared values: unix, community, foss, selfhosted, modern_lang\", \"both remote-friendly\", \"complementary skills: Go, HTML, CUE, Dart, Makefile\"]"
|
||||
},
|
||||
"draft": "hi Thomas,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: using CUE, Dockerfile, Go | (100 repos) | interested in foss, selfhosted\n\nSly is building: using JavaScript, Makefile, Python | (29 repos) | interested in foss, selfhosted\n\noverlap: shared values: unix, community, foss, selfhosted, modern_lang | both remote-friendly | complementary skills: Go, HTML, CUE, Dart, Makefile\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/SlyBouhafs\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
|
||||
"recipient": {
|
||||
"id": 96,
|
||||
"username": "SlyBouhafs",
|
||||
"platform": "github",
|
||||
"name": "Sly",
|
||||
"url": "https://github.com/SlyBouhafs",
|
||||
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
|
||||
"signals": "[\"unix\", \"community\", \"foss\", \"selfhosted\", \"modern_lang\", \"remote\"]",
|
||||
"extra": "{\"topics\": [], \"languages\": {\"JavaScript\": 9, \"Makefile\": 1, \"Python\": 2, \"HTML\": 2, \"TypeScript\": 3, \"Lua\": 1, \"Vim script\": 1}, \"repo_count\": 29, \"total_stars\": 23, \"extra\": {\"topics\": [], \"languages\": {\"JavaScript\": 9, \"Makefile\": 1, \"Python\": 2, \"HTML\": 2, \"TypeScript\": 3, \"Lua\": 1, \"Vim script\": 1}, \"repo_count\": 29, \"total_stars\": 23, \"hireable\": true, \"handles\": {}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:09:12.423838\"}"
|
||||
},
|
||||
"queued_at": "2025-12-16T00:33:56.930693",
|
||||
"status": "pending"
|
||||
},
|
||||
{
|
||||
"match": {
|
||||
"id": 12980,
|
||||
"human_a": {
|
||||
"id": 775,
|
||||
"username": "CarlSchwan",
|
||||
"platform": "github",
|
||||
"name": "Carl Schwan",
|
||||
"url": "https://github.com/CarlSchwan",
|
||||
"contact": "{\"email\": \"carlschwan@kde.org\", \"emails\": [\"carlschwan@kde.org\", \"carl@carlschwan.eu\"], \"blog\": \"https://carlschwan.eu\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
|
||||
"signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\"]",
|
||||
"extra": "{\"topics\": [], \"languages\": {\"C++\": 3, \"Shell\": 3, \"Lua\": 2, \"PHP\": 3, \"QML\": 1, \"CSS\": 1}, \"repo_count\": 100, \"total_stars\": 20, \"extra\": {\"topics\": [], \"languages\": {\"C++\": 3, \"Shell\": 3, \"Lua\": 2, \"PHP\": 3, \"QML\": 1, \"CSS\": 1}, \"repo_count\": 100, \"total_stars\": 20, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:52:38.446226\"}"
|
||||
},
|
||||
"human_b": {
|
||||
"id": 665,
|
||||
"username": "TCOTC",
|
||||
"platform": "github",
|
||||
"name": "Jeffrey Chen",
|
||||
"url": "https://github.com/TCOTC",
|
||||
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
|
||||
"signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\"]",
|
||||
"extra": "{\"topics\": [], \"languages\": {\"TypeScript\": 14, \"JavaScript\": 11, \"SCSS\": 2, \"Go\": 8, \"Kotlin\": 13, \"HTML\": 3, \"Python\": 10, \"C++\": 7, \"Java\": 5, \"PHP\": 2, \"Rust\": 2, \"Vue\": 4, \"C#\": 4, \"Shell\": 2, \"Swift\": 1, \"Ruby\": 1, \"Dart\": 1, \"Svelte\": 1}, \"repo_count\": 100, \"total_stars\": 19, \"extra\": {\"topics\": [], \"languages\": {\"TypeScript\": 14, \"JavaScript\": 11, \"SCSS\": 2, \"Go\": 8, \"Kotlin\": 13, \"HTML\": 3, \"Python\": 10, \"C++\": 7, \"Java\": 5, \"PHP\": 2, \"Rust\": 2, \"Vue\": 4, \"C#\": 4, \"Shell\": 2, \"Swift\": 1, \"Ruby\": 1, \"Dart\": 1, \"Svelte\": 1}, \"repo_count\": 100, \"total_stars\": 19, \"hireable\": null, \"handles\": {\"github\": \"siyuan-note\"}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:45:24.008492\"}"
|
||||
},
|
||||
"overlap_score": 135.0,
|
||||
"overlap_reasons": "[\"shared values: unix, foss, federated_chat, privacy, community\", \"complementary skills: Python, Kotlin, Ruby, JavaScript, QML\"]"
|
||||
},
|
||||
"draft": "hi Carl,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: using C++, Shell, Lua | (100 repos) | interested in foss, privacy, selfhosted\n\nJeffrey is building: using TypeScript, JavaScript, SCSS | (100 repos) | interested in foss, privacy, selfhosted\n\noverlap: shared values: unix, foss, federated_chat, privacy, community | complementary skills: Python, Kotlin, Ruby, JavaScript, QML\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/TCOTC\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
|
||||
"recipient": {
|
||||
"id": 665,
|
||||
"username": "TCOTC",
|
||||
"platform": "github",
|
||||
"name": "Jeffrey Chen",
|
||||
"url": "https://github.com/TCOTC",
|
||||
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
|
||||
"signals": "[\"unix\", \"community\", \"foss\", \"federated_chat\", \"privacy\", \"selfhosted\", \"modern_lang\", \"containers\"]",
|
||||
"extra": "{\"topics\": [], \"languages\": {\"TypeScript\": 14, \"JavaScript\": 11, \"SCSS\": 2, \"Go\": 8, \"Kotlin\": 13, \"HTML\": 3, \"Python\": 10, \"C++\": 7, \"Java\": 5, \"PHP\": 2, \"Rust\": 2, \"Vue\": 4, \"C#\": 4, \"Shell\": 2, \"Swift\": 1, \"Ruby\": 1, \"Dart\": 1, \"Svelte\": 1}, \"repo_count\": 100, \"total_stars\": 19, \"extra\": {\"topics\": [], \"languages\": {\"TypeScript\": 14, \"JavaScript\": 11, \"SCSS\": 2, \"Go\": 8, \"Kotlin\": 13, \"HTML\": 3, \"Python\": 10, \"C++\": 7, \"Java\": 5, \"PHP\": 2, \"Rust\": 2, \"Vue\": 4, \"C#\": 4, \"Shell\": 2, \"Swift\": 1, \"Ruby\": 1, \"Dart\": 1, \"Svelte\": 1}, \"repo_count\": 100, \"total_stars\": 19, \"hireable\": null, \"handles\": {\"github\": \"siyuan-note\"}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:45:24.008492\"}"
|
||||
},
|
||||
"queued_at": "2025-12-16T00:59:33.606115",
|
||||
"status": "pending"
|
||||
},
|
||||
{
|
||||
"match": {
|
||||
"id": 12457,
|
||||
"human_a": {
|
||||
"id": 171,
|
||||
"username": "louislam",
|
||||
"platform": "github",
|
||||
"name": "Louis Lam",
|
||||
"url": "https://github.com/louislam",
|
||||
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": \"louislam\", \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
|
||||
"signals": "[\"foss\", \"selfhosted\", \"modern_lang\", \"containers\", \"remote\"]",
|
||||
"extra": "{\"topics\": [\"self-hosted\"], \"languages\": {\"JavaScript\": 10, \"PHP\": 6, \"TypeScript\": 11, \"HTML\": 1, \"Vue\": 1, \"Shell\": 3, \"Java\": 2, \"C#\": 2, \"Hack\": 1, \"Kotlin\": 1, \"Dockerfile\": 2, \"PLpgSQL\": 1, \"CSS\": 2, \"Smarty\": 1, \"Visual Basic\": 1}, \"repo_count\": 56, \"total_stars\": 101905, \"extra\": {\"topics\": [\"self-hosted\"], \"languages\": {\"JavaScript\": 10, \"PHP\": 6, \"TypeScript\": 11, \"HTML\": 1, \"Vue\": 1, \"Shell\": 3, \"Java\": 2, \"C#\": 2, \"Hack\": 1, \"Kotlin\": 1, \"Dockerfile\": 2, \"PLpgSQL\": 1, \"CSS\": 2, \"Smarty\": 1, \"Visual Basic\": 1}, \"repo_count\": 56, \"total_stars\": 101905, \"hireable\": true, \"handles\": {\"twitter\": \"@louislam\", \"github\": \"louislam\"}}, \"hireable\": true, \"scraped_at\": \"2025-12-15T22:13:56.492647\"}"
|
||||
},
|
||||
"human_b": {
|
||||
"id": 364,
|
||||
"username": "anokfireball",
|
||||
"platform": "github",
|
||||
"name": "Fabian Koller",
|
||||
"url": "https://github.com/anokfireball",
|
||||
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
|
||||
"signals": "[\"foss\", \"selfhosted\", \"modern_lang\", \"containers\", \"p2p\", \"remote\"]",
|
||||
"extra": "{\"topics\": [], \"languages\": {\"Jinja\": 1, \"HCL\": 1, \"Shell\": 1, \"Python\": 4, \"Dockerfile\": 1, \"C\": 2, \"Lua\": 1, \"Go\": 1, \"C++\": 2}, \"repo_count\": 22, \"total_stars\": 13, \"extra\": {\"topics\": [], \"languages\": {\"Jinja\": 1, \"HCL\": 1, \"Shell\": 1, \"Python\": 4, \"Dockerfile\": 1, \"C\": 2, \"Lua\": 1, \"Go\": 1, \"C++\": 2}, \"repo_count\": 22, \"total_stars\": 13, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:25:55.690643\"}"
|
||||
},
|
||||
"overlap_score": 129.61885790097358,
|
||||
"overlap_reasons": "[\"shared values: foss, selfhosted, modern_lang, containers, remote\", \"both remote-friendly\", \"complementary skills: Python, Kotlin, HCL, Jinja, C++\"]"
|
||||
},
|
||||
"draft": "hi Louis,\n\ni'm an AI that connects isolated builders working on similar things.\n\nyou're building: working on self-hosted | using JavaScript, PHP, TypeScript | (56 repos) | interested in foss, selfhosted\n\nFabian is building: using Jinja, HCL, Shell | (22 repos) | interested in foss, selfhosted\n\noverlap: shared values: foss, selfhosted, modern_lang, containers, remote | both remote-friendly | complementary skills: Python, Kotlin, HCL, Jinja, C++\n\nthought you might benefit from knowing each other.\n\ntheir work: https://github.com/anokfireball\n\nno pitch. just connection. ignore if not useful.\n\n- connectd\n",
|
||||
"recipient": {
|
||||
"id": 364,
|
||||
"username": "anokfireball",
|
||||
"platform": "github",
|
||||
"name": "Fabian Koller",
|
||||
"url": "https://github.com/anokfireball",
|
||||
"contact": "{\"email\": null, \"emails\": [], \"blog\": \"\", \"twitter\": null, \"mastodon\": null, \"bluesky\": null, \"matrix\": null, \"lemmy\": null}",
|
||||
"signals": "[\"foss\", \"selfhosted\", \"modern_lang\", \"containers\", \"p2p\", \"remote\"]",
|
||||
"extra": "{\"topics\": [], \"languages\": {\"Jinja\": 1, \"HCL\": 1, \"Shell\": 1, \"Python\": 4, \"Dockerfile\": 1, \"C\": 2, \"Lua\": 1, \"Go\": 1, \"C++\": 2}, \"repo_count\": 22, \"total_stars\": 13, \"extra\": {\"topics\": [], \"languages\": {\"Jinja\": 1, \"HCL\": 1, \"Shell\": 1, \"Python\": 4, \"Dockerfile\": 1, \"C\": 2, \"Lua\": 1, \"Go\": 1, \"C++\": 2}, \"repo_count\": 22, \"total_stars\": 13, \"hireable\": null, \"handles\": {}}, \"hireable\": null, \"scraped_at\": \"2025-12-15T22:25:55.690643\"}"
|
||||
},
|
||||
"queued_at": "2025-12-16T01:12:40.906296",
|
||||
"status": "pending"
|
||||
}
|
||||
]
82 backups/data_20251215_194141/org_cache.json Normal file
@@ -0,0 +1,82 @@
{
  "users": {
    "testuser": [
      "home-assistant",
      "esphome"
    ],
    "sudoxnym": [],
    "joyeusenoelle": [],
    "sbilly": [
      "awesome-security"
    ],
    "turt2live": [
      "matrix-org",
      "element-hq",
      "ENTS-Source",
      "IETF-Hackathon",
      "t2bot"
    ],
    "balloob": [
      "home-assistant",
      "hassio-addons",
      "NabuCasa",
      "esphome",
      "OpenHomeFoundation"
    ],
    "anikdhabal": [],
    "fabaff": [
      "NixOS",
      "home-assistant",
      "affolter-engineering",
      "esphome",
      "home-assistant-ecosystem"
    ],
    "uhthomas": [
      "wiz-sec"
    ],
    "emontnemery": [],
    "Stradex": [],
    "Tribler": [],
    "bdraco": [
      "CpanelInc",
      "aio-libs",
      "home-assistant",
      "esphome",
      "python-kasa",
      "home-assistant-libs",
      "Bluetooth-Devices",
      "python-zeroconf",
      "pyenphase",
      "ESPHome-RATGDO",
      "ratgdo",
      "OpenHomeFoundation",
      "uilibs",
      "sblibs",
      "openvideolibs",
      "Harmony-Libs",
      "lightinglibs",
      "kohlerlibs",
      "open-home-foundation-maintainers",
      "Yale-Libs",
      "Solarlibs",
      "esphome-libs"
    ],
    "ArchiveBox": []
  },
  "updated": {
    "testuser": "2025-12-14T22:44:28.772479",
    "sudoxnym": "2025-12-14T22:51:13.523581",
    "joyeusenoelle": "2025-12-14T23:19:46.135417",
    "sbilly": "2025-12-14T23:19:55.813111",
    "turt2live": "2025-12-14T23:20:04.266843",
    "balloob": "2025-12-14T23:20:20.527129",
    "anikdhabal": "2025-12-14T23:20:32.904717",
    "fabaff": "2025-12-14T23:20:39.889442",
    "uhthomas": "2025-12-14T23:20:59.048667",
    "emontnemery": "2025-12-14T23:21:06.590806",
    "Stradex": "2025-12-14T23:21:14.490327",
    "Tribler": "2025-12-14T23:21:24.234634",
    "bdraco": "2025-12-14T23:26:12.662456",
    "ArchiveBox": "2025-12-14T23:26:32.513637"
  }
}
137 backups/delivery_log_20251215_194141.json Normal file
@@ -0,0 +1,137 @@
{
  "sent": [
    {
      "recipient_id": "github:dwmw2",
      "recipient_name": "David Woodhouse",
      "method": "email",
      "contact_info": "dwmw2@infradead.org",
      "overlap_score": 172.01631023799695,
      "timestamp": "2025-12-15T23:14:45.542509",
      "success": true,
      "error": null
    },
    {
      "recipient_id": "github:pvizeli",
      "recipient_name": "Pascal Vizeli",
      "method": "email",
      "contact_info": "pascal.vizeli@syshack.ch",
      "overlap_score": 163.33333333333331,
      "timestamp": "2025-12-15T23:14:48.462716",
      "success": true,
      "error": null
    },
    {
      "recipient_id": "github:2234839",
      "recipient_name": "\u5d2e\u751f",
      "method": "email",
      "contact_info": "admin@shenzilong.cn",
      "overlap_score": 163.09442000261095,
      "timestamp": "2025-12-15T23:14:50.749442",
      "success": true,
      "error": null
    },
    {
      "recipient_id": "github:zomars",
      "recipient_name": "Omar L\u00f3pez",
      "method": "email",
      "contact_info": "zomars@me.com",
      "overlap_score": 138.9593178751708,
      "timestamp": "2025-12-16T00:39:43.266181",
      "success": true,
      "error": null
    },
    {
      "recipient_id": "github:joshuaboniface",
      "recipient_name": "Joshua M. Boniface",
      "method": "mastodon",
      "contact_info": "@joshuaboniface@www.youtube.com",
      "overlap_score": 136.06304901929022,
      "timestamp": "2025-12-16T00:59:21.763092",
      "success": true,
      "error": "https://mastodon.sudoxreboot.com/@connectd/115726533401043321"
    },
    {
      "recipient_id": "github:dariusk",
      "recipient_name": "Darius Kazemi",
      "method": "mastodon",
      "contact_info": "@darius@friend.camp",
      "overlap_score": 135.39490109778416,
      "timestamp": "2025-12-16T00:59:22.199945",
      "success": true,
      "error": "https://mastodon.sudoxreboot.com/@connectd/115726533505124538"
    }
  ],
  "failed": [
    {
      "recipient_id": "github:joyeusenoelle",
      "recipient_name": "No\u00eblle Anthony",
      "method": "mastodon",
      "contact_info": "@noelle@chat.noelle.codes",
      "overlap_score": 65,
      "timestamp": "2025-12-14T23:44:17.215796",
      "success": false,
      "error": "MASTODON_TOKEN not set"
    },
    {
      "recipient_id": "github:balloob",
      "recipient_name": "Paulus Schoutsen",
      "method": "mastodon",
      "contact_info": "@home_assistant@youtube.com",
      "overlap_score": 163.09442000261095,
      "timestamp": "2025-12-15T23:14:50.155178",
      "success": false,
      "error": "mastodon api error: 401 - {\"error\":\"The access token is invalid\"}"
    },
    {
      "recipient_id": "github:balloob",
      "recipient_name": "Paulus Schoutsen",
      "method": "mastodon",
      "contact_info": "@home_assistant@youtube.com",
      "overlap_score": 163.09442000261095,
      "timestamp": "2025-12-15T23:14:50.334902",
      "success": false,
      "error": "mastodon api error: 401 - {\"error\":\"The access token is invalid\"}"
    },
    {
      "recipient_id": "github:joshuaboniface",
      "recipient_name": "Joshua M. Boniface",
      "method": "mastodon",
      "contact_info": "@joshuaboniface@www.youtube.com",
      "overlap_score": 136.06304901929022,
      "timestamp": "2025-12-16T00:53:25.848601",
      "success": false,
      "error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794e05e490>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
    },
    {
      "recipient_id": "github:joshuaboniface",
      "recipient_name": "Joshua M. Boniface",
      "method": "mastodon",
      "contact_info": "@joshuaboniface@www.youtube.com",
      "overlap_score": 136.06304901929022,
      "timestamp": "2025-12-16T00:53:55.912872",
      "success": false,
      "error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794e07b1d0>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
    },
    {
      "recipient_id": "github:dariusk",
      "recipient_name": "Darius Kazemi",
      "method": "mastodon",
      "contact_info": "@darius@friend.camp",
      "overlap_score": 135.39490109778416,
      "timestamp": "2025-12-16T00:54:25.947404",
      "success": false,
      "error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794e0986d0>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
    },
    {
      "recipient_id": "github:dariusk",
      "recipient_name": "Darius Kazemi",
      "method": "mastodon",
      "contact_info": "@darius@friend.camp",
      "overlap_score": 135.39490109778416,
      "timestamp": "2025-12-16T00:54:55.982839",
      "success": false,
      "error": "HTTPSConnectionPool(host='mastodon.sudoxreboot.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by ConnectTimeoutError(<HTTPSConnection(host='mastodon.sudoxreboot.com', port=443) at 0x7f794de9dd90>, 'Connection to mastodon.sudoxreboot.com timed out. (connect timeout=30)'))"
    }
  ],
  "queued": []
}
204 central_client.py Normal file
@@ -0,0 +1,204 @@
"""
|
||||
connectd/central_client.py - client for connectd-central API
|
||||
|
||||
provides similar interface to local Database class but uses remote API.
|
||||
allows distributed instances to share data and coordinate outreach.
|
||||
"""
|
||||
|
||||
import os
|
||||
import json
|
||||
import requests
|
||||
from typing import Optional, List, Dict, Any, Tuple
|
||||
from datetime import datetime
|
||||
|
||||
CENTRAL_API = os.environ.get('CONNECTD_CENTRAL_API', '')
|
||||
API_KEY = os.environ.get('CONNECTD_API_KEY', '')
|
||||
INSTANCE_ID = os.environ.get('CONNECTD_INSTANCE_ID', 'default')
|
||||
|
||||
|
||||
class CentralClient:
|
||||
"""client for connectd-central API"""
|
||||
|
||||
def __init__(self, api_url: str = None, api_key: str = None, instance_id: str = None):
|
||||
self.api_url = api_url or CENTRAL_API
|
||||
self.api_key = api_key or API_KEY
|
||||
self.instance_id = instance_id or INSTANCE_ID
|
||||
self.headers = {
|
||||
'X-API-Key': self.api_key,
|
||||
'Content-Type': 'application/json'
|
||||
}
|
||||
|
||||
if not self.api_key:
|
||||
raise ValueError('CONNECTD_API_KEY environment variable required')
|
||||
|
||||
def _get(self, endpoint: str, params: dict = None) -> dict:
|
||||
resp = requests.get(f'{self.api_url}{endpoint}', headers=self.headers, params=params)
|
||||
resp.raise_for_status()
|
||||
return resp.json()
|
||||
|
||||
def _post(self, endpoint: str, data: dict) -> dict:
|
||||
resp = requests.post(f'{self.api_url}{endpoint}', headers=self.headers, json=data)
|
||||
resp.raise_for_status()
|
||||
return resp.json()
|
||||
|
||||
# === HUMANS ===
|
||||
|
||||
def get_human(self, human_id: int) -> Optional[dict]:
|
||||
try:
|
||||
return self._get(f'/humans/{human_id}')
|
||||
except:
|
||||
return None
|
||||
|
||||
def get_humans(self, platform: str = None, user_type: str = None,
|
||||
min_score: float = 0, limit: int = 100, offset: int = 0) -> List[dict]:
|
||||
params = {'min_score': min_score, 'limit': limit, 'offset': offset}
|
||||
if platform:
|
||||
params['platform'] = platform
|
||||
if user_type:
|
||||
params['user_type'] = user_type
|
||||
result = self._get('/humans', params)
|
||||
return result.get('humans', [])
|
||||
|
||||
def get_all_humans(self, min_score: float = 0, limit: int = 100000) -> List[dict]:
|
||||
"""get all humans (for matching)"""
|
||||
return self.get_humans(min_score=min_score, limit=limit)
|
||||
|
||||
def get_lost_builders(self, min_score: float = 30, limit: int = 100) -> List[dict]:
|
||||
"""get lost builders for outreach"""
|
||||
return self.get_humans(user_type='lost', min_score=min_score, limit=limit)
|
||||
|
||||
def get_builders(self, min_score: float = 50, limit: int = 100) -> List[dict]:
|
||||
"""get active builders"""
|
||||
return self.get_humans(user_type='builder', min_score=min_score, limit=limit)
|
||||
|
||||
def upsert_human(self, human: dict) -> int:
|
||||
"""create or update human, returns id"""
|
||||
result = self._post('/humans', human)
|
||||
return result.get('id')
|
||||
|
||||
def upsert_humans_bulk(self, humans: List[dict]) -> Tuple[int, int]:
|
||||
"""bulk upsert humans, returns (created, updated)"""
|
||||
result = self._post('/humans/bulk', humans)
|
||||
return result.get('created', 0), result.get('updated', 0)
|
||||
|
||||
# === MATCHES ===
|
||||
|
||||
def get_matches(self, min_score: float = 0, limit: int = 100, offset: int = 0) -> List[dict]:
|
||||
params = {'min_score': min_score, 'limit': limit, 'offset': offset}
|
||||
result = self._get('/matches', params)
|
||||
return result.get('matches', [])
|
||||
|
||||
def create_match(self, human_a_id: int, human_b_id: int,
|
||||
overlap_score: float, overlap_reasons: str = None) -> int:
|
||||
"""create match, returns id"""
|
||||
result = self._post('/matches', {
|
||||
'human_a_id': human_a_id,
|
||||
'human_b_id': human_b_id,
|
||||
'overlap_score': overlap_score,
|
||||
'overlap_reasons': overlap_reasons
|
||||
})
|
||||
return result.get('id')
|
||||
|
||||
def create_matches_bulk(self, matches: List[dict]) -> int:
|
||||
"""bulk create matches, returns count"""
|
||||
result = self._post('/matches/bulk', matches)
|
||||
return result.get('created', 0)
|
||||
|
||||
# === OUTREACH COORDINATION ===
|
||||
|
||||
def get_pending_outreach(self, outreach_type: str = None, limit: int = 50) -> List[dict]:
|
||||
"""get pending outreach that hasn't been claimed"""
|
||||
params = {'limit': limit}
|
||||
if outreach_type:
|
||||
params['outreach_type'] = outreach_type
|
||||
result = self._get('/outreach/pending', params)
|
||||
return result.get('pending', [])
|
||||
|
||||
def claim_outreach(self, human_id: int, match_id: int = None,
|
||||
outreach_type: str = 'intro') -> Optional[int]:
|
||||
"""claim outreach for a human, returns outreach_id or None if already claimed"""
|
||||
try:
|
||||
result = self._post('/outreach/claim', {
|
||||
'human_id': human_id,
|
||||
'match_id': match_id,
|
||||
'outreach_type': outreach_type
|
||||
})
|
||||
return result.get('outreach_id')
|
||||
except requests.exceptions.HTTPError as e:
|
||||
if e.response.status_code == 409:
|
||||
return None # already claimed by another instance
|
||||
raise
|
||||
|
||||
def complete_outreach(self, outreach_id: int, status: str,
|
||||
sent_via: str = None, draft: str = None, error: str = None):
|
||||
"""mark outreach as complete"""
|
||||
self._post('/outreach/complete', {
|
||||
'outreach_id': outreach_id,
|
||||
'status': status,
|
||||
'sent_via': sent_via,
|
||||
'draft': draft,
|
||||
'error': error
|
||||
})
|
||||
|
||||
def get_outreach_history(self, status: str = None, limit: int = 100) -> List[dict]:
|
||||
params = {'limit': limit}
|
||||
if status:
|
||||
params['status'] = status
|
||||
result = self._get('/outreach/history', params)
|
||||
return result.get('history', [])
|
||||
|
||||
def already_contacted(self, human_id: int) -> bool:
|
||||
"""check if human has been contacted"""
|
||||
history = self._get('/outreach/history', {'limit': 10000})
|
||||
sent = history.get('history', [])
|
||||
return any(h['human_id'] == human_id and h['status'] == 'sent' for h in sent)
|
||||
|
||||
# === STATS ===
|
||||
|
||||
def get_stats(self) -> dict:
|
||||
return self._get('/stats')
|
||||
|
||||
# === INSTANCE MANAGEMENT ===
|
||||
|
||||
def register_instance(self, name: str, host: str):
|
||||
"""register this instance with central"""
|
||||
self._post(f'/instances/register?name={name}&host={host}', {})
|
||||
|
||||
def get_instances(self) -> List[dict]:
|
||||
result = self._get('/instances')
|
||||
return result.get('instances', [])
|
||||
|
||||
# === HEALTH ===
|
||||
|
||||
def health_check(self) -> bool:
|
||||
try:
|
||||
result = self._get('/health')
|
||||
return result.get('status') == 'ok'
|
||||
except:
|
||||
return False
|
||||
|
||||
|
||||
# convenience function
|
||||
|
||||
# === TOKENS ===
|
||||
|
||||
def get_token(self, user_id: int, match_id: int = None) -> str:
|
||||
"""get or create a token for a user"""
|
||||
params = {}
|
||||
if match_id:
|
||||
params['match_id'] = match_id
|
||||
result = self._get(f'/api/token/{user_id}', params)
|
||||
return result.get('token')
|
||||
|
||||
def get_interested_count(self, user_id: int) -> int:
|
||||
"""get count of people interested in this user"""
|
||||
try:
|
||||
result = self._get(f'/api/interested_count/{user_id}')
|
||||
return result.get('count', 0)
|
||||
except:
|
||||
return 0
|
||||
|
||||
|
||||
# convenience function
|
||||
def get_client() -> CentralClient:
|
||||
return CentralClient()
|
||||
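The claim/complete handshake above is what lets several daemon instances share one outreach queue without double-contacting anyone: the first `claim_outreach` wins, later ones get a 409 and back off. A minimal sketch of those semantics, using a hypothetical `InMemoryCentral` stand-in for the HTTP API (not part of connectd):

```python
# sketch: how two instances coordinate via claim_outreach's
# "None means someone else already claimed it" contract.
class InMemoryCentral:
    def __init__(self):
        self.claims = {}   # human_id -> outreach_id
        self.next_id = 1

    def claim_outreach(self, human_id, match_id=None, outreach_type='intro'):
        # mirrors CentralClient.claim_outreach: the 409 case maps to None
        if human_id in self.claims:
            return None
        outreach_id = self.next_id
        self.next_id += 1
        self.claims[human_id] = outreach_id
        return outreach_id

central = InMemoryCentral()
first = central.claim_outreach(42)    # this instance wins the claim
second = central.claim_outreach(42)   # a second instance loses
print(first, second)  # prints: 1 None
```

Callers are expected to follow a successful claim with `complete_outreach(outreach_id, 'sent', ...)` or `'failed'`, as the daemon does below.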
49
config.py
@@ -21,8 +21,8 @@ CACHE_DIR.mkdir(exist_ok=True)
 # === DAEMON CONFIG ===
 SCOUT_INTERVAL = 3600 * 4 # full scout every 4 hours
 MATCH_INTERVAL = 3600 # check matches every hour
-INTRO_INTERVAL = 3600 * 2 # send intros every 2 hours
-MAX_INTROS_PER_DAY = 20 # rate limit builder-to-builder outreach
+INTRO_INTERVAL = 1800 # send intros every 30 minutes
+MAX_INTROS_PER_DAY = 1000 # rate limit builder-to-builder outreach
 
 
 # === MATCHING CONFIG ===
@@ -42,7 +42,7 @@ LOST_CONFIG = {
 
     # outreach settings
     'enabled': True,
-    'max_per_day': 5, # lower volume, higher care
+    'max_per_day': 100,
     'require_review': False, # fully autonomous
     'cooldown_days': 90, # don't spam struggling people
 
@@ -67,9 +67,50 @@ LOST_CONFIG = {
 
 GROQ_API_KEY = os.environ.get('GROQ_API_KEY', '')
 GROQ_API_URL = 'https://api.groq.com/openai/v1/chat/completions'
-GROQ_MODEL = os.environ.get('GROQ_MODEL', 'llama-3.1-70b-versatile')
+GROQ_MODEL = os.environ.get('GROQ_MODEL', 'llama-3.3-70b-versatile')
 
 GITHUB_TOKEN = os.environ.get('GITHUB_TOKEN', '')
 
+# === FORGE TOKENS ===
+# for creating issues on self-hosted git forges
+# each forge needs its own token from that instance
+#
+# CODEBERG: Settings -> Applications -> Generate Token (repo:write scope)
+# GITEA/FORGEJO: Settings -> Applications -> Generate Token
+# GITLAB: Settings -> Access Tokens -> Personal Access Token (api scope)
+# SOURCEHUT: Settings -> Personal Access Tokens (uses email instead)
+
+CODEBERG_TOKEN = os.environ.get('CODEBERG_TOKEN', '')
+GITEA_TOKENS = {} # instance_url -> token, loaded from env
+GITLAB_TOKENS = {} # instance_url -> token, loaded from env
+
+# parse GITEA_TOKENS from env
+# format: GITEA_TOKEN_192_168_1_8_3259=token -> http://192.168.1.8:3259
+# format: GITEA_TOKEN_codeberg_org=token -> https://codeberg.org
+def _parse_instance_url(env_key, prefix):
+    """convert env key to instance URL"""
+    raw = env_key.replace(prefix, '')
+    parts = raw.split('_')
+
+    # check if last part is a port number
+    if parts[-1].isdigit() and len(parts[-1]) <= 5:
+        port = parts[-1]
+        host = '.'.join(parts[:-1])
+        # local IPs use http
+        if host.startswith('192.168.') or host.startswith('10.') or host == 'localhost':
+            return f'http://{host}:{port}'
+        return f'https://{host}:{port}'
+    else:
+        host = '.'.join(parts)
+        return f'https://{host}'
+
+for key, value in os.environ.items():
+    if key.startswith('GITEA_TOKEN_'):
+        url = _parse_instance_url(key, 'GITEA_TOKEN_')
+        GITEA_TOKENS[url] = value
+    elif key.startswith('GITLAB_TOKEN_'):
+        url = _parse_instance_url(key, 'GITLAB_TOKEN_')
+        GITLAB_TOKENS[url] = value
 MASTODON_TOKEN = os.environ.get('MASTODON_TOKEN', '')
 MASTODON_INSTANCE = os.environ.get('MASTODON_INSTANCE', '')
485
daemon.py
@@ -1,25 +1,22 @@
 #!/usr/bin/env python3
 """
 connectd daemon - continuous discovery and matchmaking
 
 two modes of operation:
 1. priority matching: find matches FOR hosts who run connectd
 2. altruistic matching: connect strangers to each other
 
-runs continuously, respects rate limits, sends intros automatically
+REWIRED TO USE CENTRAL DATABASE
 """
 
 import time
 import json
 import signal
 import os
 import sys
 from datetime import datetime, timedelta
 from pathlib import Path
 
 from db import Database
 from db.users import (init_users_table, get_priority_users, save_priority_match,
-                      get_priority_user_matches, discover_host_user)
-from scoutd import scrape_github, scrape_reddit, scrape_mastodon, scrape_lobsters, scrape_lemmy, scrape_discord
+                      get_priority_user_matches, discover_host_user, mark_match_viewed)
+from scoutd.forges import scrape_all_forges
 from config import HOST_USER
 from scoutd.github import analyze_github_user, get_github_user
 from scoutd.signals import analyze_text
@@ -32,21 +29,41 @@ from introd.send import send_email
 from introd.deliver import deliver_intro, determine_best_contact
 from config import get_lost_config
 from api import start_api_thread, update_daemon_state
+from central_client import CentralClient, get_client
+
+
+class DummyDb:
+    """dummy db that does nothing - scrapers save here but we push to central"""
+    def save_human(self, human): pass
+    def save_match(self, *args, **kwargs): pass
+    def get_human(self, *args, **kwargs): return None
+    def close(self): pass
 
 
 # daemon config
 SCOUT_INTERVAL = 3600 * 4 # full scout every 4 hours
 MATCH_INTERVAL = 3600 # check matches every hour
 INTRO_INTERVAL = 3600 * 2 # send intros every 2 hours
-LOST_INTERVAL = 3600 * 6 # lost builder outreach every 6 hours (lower volume)
-MAX_INTROS_PER_DAY = 20 # rate limit outreach
-MIN_OVERLAP_PRIORITY = 30 # min score for priority user matches
-MIN_OVERLAP_STRANGERS = 50 # higher bar for stranger intros
+LOST_INTERVAL = 3600 * 6 # lost builder outreach every 6 hours
+from config import MAX_INTROS_PER_DAY
+
+MIN_OVERLAP_PRIORITY = 30
+MIN_OVERLAP_STRANGERS = 50
 
 
 class ConnectDaemon:
     def __init__(self, dry_run=False):
-        self.db = Database()
-        init_users_table(self.db.conn)
+        # local db only for priority_users (host-specific)
+        self.local_db = Database()
+        init_users_table(self.local_db.conn)
+
+        # CENTRAL for all humans/matches
+        self.central = get_client()
+        if not self.central:
+            raise RuntimeError("CENTRAL API REQUIRED - set CONNECTD_API_KEY and CONNECTD_CENTRAL_API")
+
+        self.log("connected to CENTRAL database")
 
         self.running = True
         self.dry_run = dry_run
         self.started_at = datetime.now()
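`DummyDb` above is a null object: the scrapers keep calling their usual `db.save_human()` interface, but nothing persists locally and the daemon instead collects each scraper's return value for a bulk push to central. A sketch of that pattern with a hypothetical `fake_scraper` (not a real connectd scraper):

```python
# null-object pattern: satisfy the scraper's db interface with no-ops
# so the caller can redirect the data elsewhere.
class DummyDb:
    def save_human(self, human): pass
    def close(self): pass

def fake_scraper(db):
    humans = [{'username': 'alice'}, {'username': 'bob'}]
    for h in humans:
        db.save_human(h)  # no-op: nothing is written locally
    return humans         # the daemon bulk-pushes this list to central

scraped = fake_scraper(DummyDb())
print(len(scraped))  # prints: 2
```

This keeps the scraper code unchanged while the storage backend is swapped out underneath it.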
@@ -58,16 +75,17 @@ class ConnectDaemon:
         self.lost_intros_today = 0
         self.today = datetime.now().date()
 
-        # handle shutdown gracefully
+        # register instance
+        instance_id = os.environ.get('CONNECTD_INSTANCE_ID', 'daemon')
+        self.central.register_instance(instance_id, os.environ.get('CONNECTD_INSTANCE_IP', 'unknown'))
+
         signal.signal(signal.SIGINT, self._shutdown)
         signal.signal(signal.SIGTERM, self._shutdown)
 
         # auto-discover host user from env
         if HOST_USER:
             self.log(f"HOST_USER set: {HOST_USER}")
-            discover_host_user(self.db.conn, HOST_USER)
+            discover_host_user(self.local_db.conn, HOST_USER)
 
         # update API state
         self._update_api_state()
 
     def _shutdown(self, signum, frame):
@@ -76,10 +94,8 @@ class ConnectDaemon:
         self._update_api_state()
 
     def _update_api_state(self):
-        """update API state for HA integration"""
         now = datetime.now()
 
-        # calculate countdowns - if no cycle has run, use started_at
         def secs_until(last, interval):
             base = last if last else self.started_at
             next_run = base + timedelta(seconds=interval)
@@ -103,11 +119,9 @@ class ConnectDaemon:
         })
 
     def log(self, msg):
-        """timestamped log"""
         print(f"[{datetime.now().strftime('%Y-%m-%d %H:%M:%S')}] {msg}")
 
     def reset_daily_limits(self):
-        """reset daily intro count"""
         if datetime.now().date() != self.today:
             self.today = datetime.now().date()
             self.intros_today = 0
@@ -115,89 +129,106 @@ class ConnectDaemon:
             self.log("reset daily intro limits")
 
     def scout_cycle(self):
-        """run discovery on all platforms"""
-        self.log("starting scout cycle...")
+        """run discovery - scrape to CENTRAL"""
+        self.log("starting scout cycle (-> CENTRAL)...")
+
+        # dummy db - scrapers save here but we push to central
+        dummy_db = DummyDb()
+        scraped_humans = []
 
         try:
-            scrape_github(self.db, limit_per_source=30)
+            # github - returns list of humans
+            from scoutd.github import scrape_github
+            gh_humans = scrape_github(dummy_db, limit_per_source=30)
+            if gh_humans:
+                scraped_humans.extend(gh_humans)
+            self.log(f" github: {len(gh_humans) if gh_humans else 0} humans")
         except Exception as e:
             self.log(f"github scout error: {e}")
 
         try:
-            scrape_reddit(self.db, limit_per_sub=30)
+            from scoutd.reddit import scrape_reddit
+            reddit_humans = scrape_reddit(dummy_db, limit_per_sub=30)
+            if reddit_humans:
+                scraped_humans.extend(reddit_humans)
+            self.log(f" reddit: {len(reddit_humans) if reddit_humans else 0} humans")
         except Exception as e:
             self.log(f"reddit scout error: {e}")
 
         try:
-            scrape_mastodon(self.db, limit_per_instance=30)
+            from scoutd.mastodon import scrape_mastodon
+            masto_humans = scrape_mastodon(dummy_db, limit_per_instance=30)
+            if masto_humans:
+                scraped_humans.extend(masto_humans)
+            self.log(f" mastodon: {len(masto_humans) if masto_humans else 0} humans")
         except Exception as e:
             self.log(f"mastodon scout error: {e}")
 
         try:
-            scrape_lobsters(self.db)
+            forge_humans = scrape_all_forges(limit_per_instance=30)
+            if forge_humans:
+                scraped_humans.extend(forge_humans)
+            self.log(f" forges: {len(forge_humans) if forge_humans else 0} humans")
+        except Exception as e:
+            self.log(f"forge scout error: {e}")
+
+        try:
+            from scoutd.lobsters import scrape_lobsters
+            lob_humans = scrape_lobsters(dummy_db)
+            if lob_humans:
+                scraped_humans.extend(lob_humans)
+            self.log(f" lobsters: {len(lob_humans) if lob_humans else 0} humans")
         except Exception as e:
             self.log(f"lobsters scout error: {e}")
 
+        # push all to central
+        if scraped_humans:
+            self.log(f"pushing {len(scraped_humans)} humans to CENTRAL...")
+            try:
+                created, updated = self.central.upsert_humans_bulk(scraped_humans)
+                self.log(f" central: {created} created, {updated} updated")
+            except Exception as e:
+                self.log(f" central push error: {e}")
-        try:
-            scrape_lemmy(self.db, limit_per_community=30)
-        except Exception as e:
-            self.log(f"lemmy scout error: {e}")
-
-        try:
-            scrape_discord(self.db, limit_per_channel=50)
-        except Exception as e:
-            self.log(f"discord scout error: {e}")
 
         self.last_scout = datetime.now()
-        stats = self.db.stats()
-        self.log(f"scout complete: {stats['total_humans']} humans in db")
+        stats = self.central.get_stats()
+        self.log(f"scout complete: {stats.get('total_humans', 0)} humans in CENTRAL")
 
     def match_priority_users(self):
-        """find matches for priority users (hosts)"""
-        priority_users = get_priority_users(self.db.conn)
+        """find matches for priority users (hosts) using CENTRAL data"""
+        priority_users = get_priority_users(self.local_db.conn)
 
         if not priority_users:
             return
 
-        self.log(f"matching for {len(priority_users)} priority users...")
+        self.log(f"matching for {len(priority_users)} priority users (from CENTRAL)...")
 
-        humans = self.db.get_all_humans(min_score=20, limit=500)
+        # get humans from CENTRAL
+        humans = self.central.get_all_humans(min_score=20)
 
         for puser in priority_users:
-            # build priority user's fingerprint from their linked profiles
+            # use stored signals first (from discovery/scoring)
             puser_signals = []
             puser_text = []
+            if puser.get('signals'):
+                stored = puser['signals']
+                if isinstance(stored, str):
+                    try:
+                        stored = json.loads(stored)
+                    except:
+                        stored = []
+                puser_signals.extend(stored)
 
             if puser.get('bio'):
                 puser_text.append(puser['bio'])
-            if puser.get('interests'):
+            # supplement with interests if no signals stored
+            if not puser_signals and puser.get('interests'):
                 interests = json.loads(puser['interests']) if isinstance(puser['interests'], str) else puser['interests']
                 puser_signals.extend(interests)
             if puser.get('looking_for'):
                 puser_text.append(puser['looking_for'])
 
-            # analyze their linked github if available
-            if puser.get('github'):
-                gh_user = analyze_github_user(puser['github'])
-                if gh_user:
-                    puser_signals.extend(gh_user.get('signals', []))
+            if not puser_signals:
+                self.log(f" skipping {puser.get('name')} - no signals")
+                continue
 
             puser_fingerprint = {
                 'values_vector': {},
                 'skills': {},
                 'interests': list(set(puser_signals)),
                 'location_pref': 'pnw' if puser.get('location') and 'seattle' in puser['location'].lower() else None,
             }
 
             # score text
             if puser_text:
                 _, text_signals, _ = analyze_text(' '.join(puser_text))
                 puser_signals.extend(text_signals)
 
             # find matches
             matches_found = 0
             for human in humans:
                 # skip if it's their own profile on another platform
                 human_user = human.get('username', '').lower()
                 if puser.get('github') and human_user == puser['github'].lower():
                     continue
@@ -206,17 +237,18 @@ class ConnectDaemon:
                 if puser.get('mastodon') and human_user == puser['mastodon'].lower().split('@')[0]:
                     continue
 
                 # calculate overlap
                 human_signals = human.get('signals', [])
                 if isinstance(human_signals, str):
                     try:
                         human_signals = json.loads(human_signals)
                     except:
                         human_signals = []
 
                 shared = set(puser_signals) & set(human_signals)
                 overlap_score = len(shared) * 10
 
                 # location bonus
                 if puser.get('location') and human.get('location'):
-                    if 'seattle' in human['location'].lower() or 'pnw' in human['location'].lower():
+                    if 'seattle' in str(human.get('location', '')).lower() or 'pnw' in str(human.get('location', '')).lower():
                         overlap_score += 20
 
                 if overlap_score >= MIN_OVERLAP_PRIORITY:
@@ -224,33 +256,31 @@ class ConnectDaemon:
                         'overlap_score': overlap_score,
                         'overlap_reasons': [f"shared: {', '.join(list(shared)[:5])}"] if shared else [],
                     }
-                    save_priority_match(self.db.conn, puser['id'], human['id'], overlap_data)
+                    save_priority_match(self.local_db.conn, puser['id'], human['id'], overlap_data)
                     matches_found += 1
 
             if matches_found:
                 self.log(f" found {matches_found} matches for {puser['name'] or puser['email']}")
 
     def match_strangers(self):
-        """find matches between discovered humans (altruistic)"""
-        self.log("matching strangers...")
+        """find matches between discovered humans - save to CENTRAL"""
+        self.log("matching strangers (-> CENTRAL)...")
 
-        humans = self.db.get_all_humans(min_score=40, limit=200)
+        humans = self.central.get_all_humans(min_score=40)
 
         if len(humans) < 2:
             return
 
         # generate fingerprints
         fingerprints = {}
         for human in humans:
             fp = generate_fingerprint(human)
             fingerprints[human['id']] = fp
 
         # find pairs
-        matches_found = 0
+        new_matches = []
         from itertools import combinations
 
         for human_a, human_b in combinations(humans, 2):
             # skip same platform same user
             if human_a['platform'] == human_b['platform']:
                 if human_a['username'] == human_b['username']:
                     continue
@@ -260,41 +290,36 @@ class ConnectDaemon:
 
             overlap = find_overlap(human_a, human_b, fp_a, fp_b)
 
-            if overlap['overlap_score'] >= MIN_OVERLAP_STRANGERS:
-                # save match
-                self.db.save_match(human_a['id'], human_b['id'], overlap)
-                matches_found += 1
+            if overlap and overlap["overlap_score"] >= MIN_OVERLAP_STRANGERS:
+                new_matches.append({
+                    'human_a_id': human_a['id'],
+                    'human_b_id': human_b['id'],
+                    'overlap_score': overlap['overlap_score'],
+                    'overlap_reasons': json.dumps(overlap.get('overlap_reasons', []))
+                })
 
-        if matches_found:
-            self.log(f"found {matches_found} stranger matches")
+        # bulk push to central
+        if new_matches:
+            self.log(f"pushing {len(new_matches)} matches to CENTRAL...")
+            try:
+                created = self.central.create_matches_bulk(new_matches)
+                self.log(f" central: {created} matches created")
+            except Exception as e:
+                self.log(f" central push error: {e}")
 
         self.last_match = datetime.now()
 
     def send_stranger_intros(self):
-        """send intros to connect strangers (or preview in dry-run mode)"""
+        """send intros using CENTRAL data"""
         self.reset_daily_limits()
 
         if not self.dry_run and self.intros_today >= MAX_INTROS_PER_DAY:
             self.log("daily intro limit reached")
             return
 
-        # get unsent matches
-        c = self.db.conn.cursor()
-        c.execute('''SELECT m.*,
-            ha.id as a_id, ha.username as a_user, ha.platform as a_platform,
-            ha.name as a_name, ha.url as a_url, ha.contact as a_contact,
-            ha.signals as a_signals, ha.extra as a_extra,
-            hb.id as b_id, hb.username as b_user, hb.platform as b_platform,
-            hb.name as b_name, hb.url as b_url, hb.contact as b_contact,
-            hb.signals as b_signals, hb.extra as b_extra
-            FROM matches m
-            JOIN humans ha ON m.human_a_id = ha.id
-            JOIN humans hb ON m.human_b_id = hb.id
-            WHERE m.status = 'pending'
-            ORDER BY m.overlap_score DESC
-            LIMIT 10''')
-
-        matches = c.fetchall()
+        # get pending matches from CENTRAL
+        matches = self.central.get_matches(min_score=MIN_OVERLAP_STRANGERS, limit=20)
 
         if self.dry_run:
            self.log(f"DRY RUN: previewing {len(matches)} potential intros")
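The pairwise pass above relies on `itertools.combinations`, which yields each unordered pair of humans exactly once, so no `(a, b)` / `(b, a)` duplicates and no self-pairs. A quick standalone illustration:

```python
# combinations(iterable, 2) enumerates unordered pairs without repeats.
from itertools import combinations

humans = ['alice', 'bob', 'carol']
pairs = list(combinations(humans, 2))
print(pairs)  # prints: [('alice', 'bob'), ('alice', 'carol'), ('bob', 'carol')]
```

For n humans this is n*(n-1)/2 comparisons, which is why the matcher filters on a minimum score before pairing.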
@@ -303,59 +328,60 @@ class ConnectDaemon:
             if not self.dry_run and self.intros_today >= MAX_INTROS_PER_DAY:
                 break
 
-            match = dict(match)
+            # get full human data
+            human_a = self.central.get_human(match['human_a_id'])
+            human_b = self.central.get_human(match['human_b_id'])
 
-            # build human dicts
-            human_a = {
-                'id': match['a_id'],
-                'username': match['a_user'],
-                'platform': match['a_platform'],
-                'name': match['a_name'],
-                'url': match['a_url'],
-                'contact': match['a_contact'],
-                'signals': match['a_signals'],
-                'extra': match['a_extra'],
-            }
-            human_b = {
-                'id': match['b_id'],
-                'username': match['b_user'],
-                'platform': match['b_platform'],
-                'name': match['b_name'],
-                'url': match['b_url'],
-                'contact': match['b_contact'],
-                'signals': match['b_signals'],
-                'extra': match['b_extra'],
-            }
+            if not human_a or not human_b:
+                continue
 
             match_data = {
                 'id': match['id'],
                 'human_a': human_a,
                 'human_b': human_b,
                 'overlap_score': match['overlap_score'],
-                'overlap_reasons': match['overlap_reasons'],
+                'overlap_reasons': match.get('overlap_reasons', ''),
             }
 
             # try to send intro to person with email
             for recipient, other in [(human_a, human_b), (human_b, human_a)]:
                 contact = recipient.get('contact', {})
                 if isinstance(contact, str):
                     try:
                         contact = json.loads(contact)
                     except:
                         contact = {}
 
                 email = contact.get('email')
                 if not email:
                     continue
 
-                # draft intro
-                intro = draft_intro(match_data, recipient='a' if recipient == human_a else 'b')
+                # check if already contacted
+                if self.central.already_contacted(recipient['id']):
+                    continue
 
-                # parse overlap reasons for display
-                reasons = match['overlap_reasons']
+                # get token and interest count for recipient
+                try:
+                    recipient_token = self.central.get_token(recipient['id'], match.get('id'))
+                    interested_count = self.central.get_interested_count(recipient['id'])
+                except Exception as e:
+                    print(f"[intro] failed to get token/count: {e}")
+                    recipient_token = None
+                    interested_count = 0
+
+                intro = draft_intro(match_data,
+                                    recipient='a' if recipient == human_a else 'b',
+                                    recipient_token=recipient_token,
+                                    interested_count=interested_count)
+
+                reasons = match.get('overlap_reasons', '')
                 if isinstance(reasons, str):
                     try:
                         reasons = json.loads(reasons)
                     except:
                         reasons = []
                 reason_summary = ', '.join(reasons[:3]) if reasons else 'aligned values'
 
                 if self.dry_run:
                     # print preview
                     print("\n" + "=" * 60)
                     print(f"TO: {recipient['username']} ({recipient['platform']})")
                     print(f"EMAIL: {email}")
@@ -369,7 +395,11 @@ class ConnectDaemon:
                     print("=" * 60)
                     break
                 else:
-                    # actually send
+                    outreach_id = self.central.claim_outreach(recipient['id'], match['id'], 'intro')
+                    if outreach_id is None:
+                        self.log(f"skipping {recipient['username']} - already claimed")
+                        continue
+
                     success, error = send_email(
                         email,
                         f"connectd: you might want to meet {other['username']}",
@@ -379,22 +409,124 @@ class ConnectDaemon:
                     if success:
                         self.log(f"sent intro to {recipient['username']} ({email})")
                         self.intros_today += 1
-
-                        # mark match as intro_sent
-                        c.execute('UPDATE matches SET status = "intro_sent" WHERE id = ?',
-                                  (match['id'],))
-                        self.db.conn.commit()
+                        self.central.complete_outreach(outreach_id, 'sent', 'email', intro['draft'])
                         break
                     else:
                         self.log(f"failed to send to {email}: {error}")
+                        self.central.complete_outreach(outreach_id, 'failed', error=error)
 
         self.last_intro = datetime.now()
 
+    def send_priority_user_intros(self):
+        """send intros TO priority users (hosts) about their matches"""
+        self.reset_daily_limits()
+
+        priority_users = get_priority_users(self.local_db.conn)
+        if not priority_users:
+            return
+
+        self.log(f"checking intros for {len(priority_users)} priority users...")
+
+        for puser in priority_users:
+            if not self.dry_run and self.intros_today >= MAX_INTROS_PER_DAY:
+                break
+
+            # get email
+            email = puser.get('email')
+            if not email:
+                continue
+
+            # get their matches from local priority_matches table
+            matches = get_priority_user_matches(self.local_db.conn, puser['id'], status='new', limit=5)
+
+            if not matches:
+                continue
+
+            for match in matches:
+                if not self.dry_run and self.intros_today >= MAX_INTROS_PER_DAY:
+                    break
+
+                # get the matched human from CENTRAL (matched_human_id is central id)
+                human_id = match.get('matched_human_id')
+                if not human_id:
+                    continue
+
+                human = self.central.get_human(human_id)
+                if not human:
+                    continue
+
+                # build match data for drafting
+                overlap_reasons = match.get('overlap_reasons', '[]')
+                if isinstance(overlap_reasons, str):
+                    try:
+                        overlap_reasons = json.loads(overlap_reasons)
+                    except:
+                        overlap_reasons = []
+
+                puser_name = puser.get('name') or puser.get('email', '').split('@')[0]
+                human_name = human.get('name') or human.get('username')
+
+                # draft intro TO priority user ABOUT the matched human
+                match_data = {
+                    'id': match.get('id'),
+                    'human_a': {
+                        'username': puser_name,
+                        'platform': 'host',
+                        'name': puser_name,
+                        'bio': puser.get('bio', ''),
+                        'signals': puser.get('signals', []),
+                    },
+                    'human_b': human,
+                    'overlap_score': match.get('overlap_score', 0),
+                    'overlap_reasons': overlap_reasons,
+                }
+
+                # try to get token for priority user (they might have a central ID)
+                recipient_token = None
+                interested_count = 0
+                if puser.get('central_id'):
+                    try:
+                        recipient_token = self.central.get_token(puser['central_id'], match.get('id'))
+                        interested_count = self.central.get_interested_count(puser['central_id'])
+                    except:
+                        pass
+
+                intro = draft_intro(match_data, recipient='a',
+                                    recipient_token=recipient_token,
+                                    interested_count=interested_count)
+
+                reason_summary = ', '.join(overlap_reasons[:3]) if overlap_reasons else 'aligned values'
+
+                if self.dry_run:
+                    print("\n" + "=" * 60)
+                    print("PRIORITY USER INTRO")
+                    print("=" * 60)
+                    print(f"TO: {puser_name} ({email})")
+                    print(f"ABOUT: {human_name} ({human.get('platform')})")
+                    print(f"SCORE: {match.get('overlap_score', 0):.0f} ({reason_summary})")
+                    print("-" * 60)
+                    print("MESSAGE:")
+                    print(intro['draft'])
+                    print("-" * 60)
+                    print("[DRY RUN - NOT SENT]")
+                    print("=" * 60)
+                else:
+                    success, error = send_email(
+                        email,
+                        f"connectd: you might want to meet {human_name}",
+                        intro['draft']
+                    )
+
+                    if success:
+                        self.log(f"sent priority intro to {puser_name} about {human_name}")
+                        self.intros_today += 1
+                        # mark match as notified
+                        mark_match_viewed(self.local_db.conn, match['id'])
+                    else:
+                        self.log(f"failed to send priority intro to {email}: {error}")
+
     def send_lost_builder_intros(self):
-        """
-        reach out to lost builders - different tone, lower volume.
-        these people need encouragement, not networking.
-        """
+        """reach out to lost builders using CENTRAL data"""
         self.reset_daily_limits()
 
         lost_config = get_lost_config()
@@ -407,43 +539,60 @@ class ConnectDaemon:
             self.log("daily lost builder intro limit reached")
             return
 
-        # find lost builders with matching active builders
-        matches, error = find_matches_for_lost_builders(
-            self.db,
-            min_lost_score=lost_config.get('min_lost_score', 40),
-            min_values_score=lost_config.get('min_values_score', 20),
+        # get lost builders from CENTRAL
+        lost_builders = self.central.get_lost_builders(
+            min_score=lost_config.get('min_lost_score', 40),
             limit=max_per_day - self.lost_intros_today
         )
 
-        if error:
-            self.log(f"lost builder matching error: {error}")
-            return
+        # get active builders from CENTRAL
+        builders = self.central.get_builders(min_score=50, limit=100)
 
-        if not matches:
-            self.log("no lost builders ready for outreach")
+        if not lost_builders or not builders:
+            self.log("no lost builders or builders available")
             return
 
         if self.dry_run:
-            self.log(f"DRY RUN: previewing {len(matches)} lost builder intros")
+            self.log(f"DRY RUN: previewing {len(lost_builders)} lost builder intros")
 
-        for match in matches:
+        for lost in lost_builders:
             if not self.dry_run and self.lost_intros_today >= max_per_day:
                 break
 
-            lost = match['lost_user']
-            builder = match['inspiring_builder']
+            # find matching builder
+            best_builder = None
+            best_score = 0
+            for builder in builders:
+                lost_signals = lost.get('signals', [])
+                builder_signals = builder.get('signals', [])
+                if isinstance(lost_signals, str):
+                    try:
+                        lost_signals = json.loads(lost_signals)
+                    except:
+                        lost_signals = []
+                if isinstance(builder_signals, str):
+                    try:
+                        builder_signals = json.loads(builder_signals)
+                    except:
+                        builder_signals = []
+
+                shared = set(lost_signals) & set(builder_signals)
+                if len(shared) > best_score:
+                    best_score = len(shared)
+                    best_builder = builder
+
+            if not best_builder:
+                continue
 
             lost_name = lost.get('name') or lost.get('username')
-            builder_name = builder.get('name') or builder.get('username')
+            builder_name = best_builder.get('name') or best_builder.get('username')
 
             # draft intro
-            draft, draft_error = draft_lost_intro(lost, builder, lost_config)
+            draft, draft_error = draft_lost_intro(lost, best_builder, lost_config)
 
             if draft_error:
                 self.log(f"error drafting lost intro for {lost_name}: {draft_error}")
                 continue
 
             # determine best contact method (activity-based)
             method, contact_info = determine_best_contact(lost)
 
             if self.dry_run:
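The inner loop above is a greedy best-overlap selection: for each lost builder, pick the active builder sharing the most signals. A standalone sketch with a hypothetical `best_match` helper mirroring that inline logic (including the JSON-string signals that central may return):

```python
# greedy selection: the builder with the largest signal intersection wins.
import json

def best_match(lost_signals, builders):
    best, score = None, 0
    for b in builders:
        sig = b.get('signals', [])
        if isinstance(sig, str):  # central may store signals as a JSON string
            sig = json.loads(sig)
        shared = set(lost_signals) & set(sig)
        if len(shared) > score:
            best, score = b, len(shared)
    return best, score

builders = [{'name': 'x', 'signals': ['rust', 'p2p']},
            {'name': 'y', 'signals': '["rust", "p2p", "mesh"]'}]
picked, score = best_match(['rust', 'mesh', 'p2p'], builders)
print(picked['name'], score)  # prints: y 3
```

Note this is O(lost × builders) and ignores ties; the daemon caps both lists, which keeps the cost bounded.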
@@ -453,9 +602,7 @@ class ConnectDaemon:
                 print(f"TO: {lost_name} ({lost.get('platform')})")
                 print(f"DELIVERY: {method} → {contact_info}")
                 print(f"LOST SCORE: {lost.get('lost_potential_score', 0)}")
-                print(f"VALUES SCORE: {lost.get('score', 0)}")
                 print(f"INSPIRING BUILDER: {builder_name}")
-                print(f"SHARED INTERESTS: {', '.join(match.get('shared_interests', []))}")
                 print("-" * 60)
                 print("MESSAGE:")
                 print(draft)
@@ -463,12 +610,11 @@ class ConnectDaemon:
                 print("[DRY RUN - NOT SENT]")
                 print("=" * 60)
             else:
-                # build match data for unified delivery
                 match_data = {
-                    'human_a': builder,  # inspiring builder
-                    'human_b': lost,  # lost builder (recipient)
-                    'overlap_score': match.get('match_score', 0),
-                    'overlap_reasons': match.get('shared_interests', []),
+                    'human_a': best_builder,
+                    'human_b': lost,
+                    'overlap_score': best_score * 10,
+                    'overlap_reasons': [],
                 }

                 success, error, delivery_method = deliver_intro(match_data, draft)
@@ -476,7 +622,6 @@ class ConnectDaemon:
                 if success:
                     self.log(f"sent lost builder intro to {lost_name} via {delivery_method}")
                     self.lost_intros_today += 1
                     self.db.mark_lost_outreach(lost['id'])
                 else:
                     self.log(f"failed to reach {lost_name} via {delivery_method}: {error}")

@@ -485,9 +630,8 @@ class ConnectDaemon:

     def run(self):
         """main daemon loop"""
-        self.log("connectd daemon starting...")
+        self.log("connectd daemon starting (CENTRAL MODE)...")

-        # start API server
         start_api_thread()
         self.log("api server started on port 8099")

@@ -506,36 +650,31 @@ class ConnectDaemon:
         while self.running:
             now = datetime.now()

             # scout cycle
             if not self.last_scout or (now - self.last_scout).seconds >= SCOUT_INTERVAL:
                 self.scout_cycle()
                 self._update_api_state()

             # match cycle
             if not self.last_match or (now - self.last_match).seconds >= MATCH_INTERVAL:
                 self.match_priority_users()
                 self.match_strangers()
                 self._update_api_state()

             # intro cycle
             if not self.last_intro or (now - self.last_intro).seconds >= INTRO_INTERVAL:
                 self.send_stranger_intros()
                 self.send_priority_user_intros()
                 self._update_api_state()

             # lost builder cycle
             if not self.last_lost or (now - self.last_lost).seconds >= LOST_INTERVAL:
                 self.send_lost_builder_intros()
                 self._update_api_state()

             # sleep between checks
             time.sleep(60)

         self.log("connectd daemon stopped")
         self.db.close()
         self.local_db.close()


 def run_daemon(dry_run=False):
     """entry point"""
     daemon = ConnectDaemon(dry_run=dry_run)
     daemon.run()

db/users.py (16 lines changed)
@@ -139,20 +139,18 @@ def save_priority_match(conn, priority_user_id, human_id, overlap_data):


 def get_priority_user_matches(conn, priority_user_id, status=None, limit=50):
-    """get matches for a priority user"""
+    """get matches for a priority user (humans fetched from CENTRAL separately)"""
     c = conn.cursor()

     if status:
-        c.execute('''SELECT pm.*, h.* FROM priority_matches pm
-                     JOIN humans h ON pm.matched_human_id = h.id
-                     WHERE pm.priority_user_id = ? AND pm.status = ?
-                     ORDER BY pm.overlap_score DESC
+        c.execute('''SELECT * FROM priority_matches
+                     WHERE priority_user_id = ? AND status = ?
+                     ORDER BY overlap_score DESC
                      LIMIT ?''', (priority_user_id, status, limit))
     else:
-        c.execute('''SELECT pm.*, h.* FROM priority_matches pm
-                     JOIN humans h ON pm.matched_human_id = h.id
-                     WHERE pm.priority_user_id = ?
-                     ORDER BY pm.overlap_score DESC
+        c.execute('''SELECT * FROM priority_matches
+                     WHERE priority_user_id = ?
+                     ORDER BY overlap_score DESC
                      LIMIT ?''', (priority_user_id, limit))

     return [dict(row) for row in c.fetchall()]
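The rewritten query drops the JOIN on `humans`, so callers now receive bare match rows and must resolve `matched_human_id` against the central instance themselves. A minimal sqlite3 sketch of the new query shape (table contents are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.row_factory = sqlite3.Row  # rows usable as dicts, like the daemon's cursor
conn.executescript("""
CREATE TABLE priority_matches (priority_user_id INT, matched_human_id INT,
                               overlap_score REAL, status TEXT);
INSERT INTO priority_matches VALUES (1, 10, 0.9, 'new'), (1, 11, 0.5, 'new');
""")
c = conn.execute('''SELECT * FROM priority_matches
                    WHERE priority_user_id = ? AND status = ?
                    ORDER BY overlap_score DESC
                    LIMIT ?''', (1, 'new', 50))
rows = [dict(r) for r in c.fetchall()]
print([r['matched_human_id'] for r in rows])  # [10, 11] - human rows come from CENTRAL
```

The trade-off is one extra round trip per match to fetch the human record, in exchange for not mirroring the full `humans` table locally.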
@@ -183,7 +183,7 @@ class Database:
         row = c.fetchone()
         return dict(row) if row else None

-    def get_all_humans(self, min_score=0, limit=1000):
+    def get_all_humans(self, min_score=0, limit=100000):
         """get all humans above score threshold"""
         c = self.conn.cursor()
         c.execute('''SELECT * FROM humans
@@ -347,10 +347,10 @@ class Database:
         c.execute('SELECT COUNT(*) FROM matches')
         stats['total_matches'] = c.fetchone()[0]

-        c.execute('SELECT COUNT(*) FROM intros')
+        c.execute('SELECT COUNT(*) FROM matches WHERE status = "intro_sent"')
         stats['total_intros'] = c.fetchone()[0]

-        c.execute('SELECT COUNT(*) FROM intros WHERE status = "sent"')
+        c.execute('SELECT COUNT(*) FROM matches WHERE status = "intro_sent"')
         stats['sent_intros'] = c.fetchone()[0]

         # lost builder stats
@@ -373,3 +373,64 @@ class Database:

     def close(self):
         self.conn.close()
+
+    def purge_disqualified(self):
+        """
+        auto-cleanup: remove all matches/intros involving users with disqualifying signals
+        DISQUALIFYING: maga, conspiracy, conservative, antivax, sovcit
+        """
+        c = self.conn.cursor()
+        purged = {}
+
+        # patterns to match disqualifying signals
+        disq_patterns = ["maga", "conspiracy", "conservative", "antivax", "sovcit"]
+
+        # build WHERE clause for negative_signals check
+        neg_check = " OR ".join([f"negative_signals LIKE '%{p}%'" for p in disq_patterns])
+
+        # 1. delete from intros where recipient is disqualified
+        c.execute(f"""
+            DELETE FROM intros WHERE recipient_human_id IN (
+                SELECT id FROM humans WHERE {neg_check}
+            )
+        """)
+        purged["intros"] = c.rowcount
+
+        # 2. delete from priority_matches where matched_human is disqualified
+        c.execute(f"""
+            DELETE FROM priority_matches WHERE matched_human_id IN (
+                SELECT id FROM humans WHERE {neg_check}
+            )
+        """)
+        purged["priority_matches"] = c.rowcount
+
+        # 3. delete from matches where either human is disqualified
+        c.execute(f"""
+            DELETE FROM matches WHERE
+                human_a_id IN (SELECT id FROM humans WHERE {neg_check})
+                OR human_b_id IN (SELECT id FROM humans WHERE {neg_check})
+        """)
+        purged["matches"] = c.rowcount
+
+        # 4. cleanup orphaned records (humans deleted but refs remain)
+        c.execute("""
+            DELETE FROM matches WHERE
+                NOT EXISTS (SELECT 1 FROM humans h WHERE h.id = human_a_id)
+                OR NOT EXISTS (SELECT 1 FROM humans h WHERE h.id = human_b_id)
+        """)
+        purged["orphaned_matches"] = c.rowcount
+
+        c.execute("""
+            DELETE FROM priority_matches WHERE
+                NOT EXISTS (SELECT 1 FROM humans h WHERE h.id = matched_human_id)
+        """)
+        purged["orphaned_priority"] = c.rowcount
+
+        c.execute("""
+            DELETE FROM intros WHERE
+                NOT EXISTS (SELECT 1 FROM humans h WHERE h.id = recipient_human_id)
+        """)
+        purged["orphaned_intros"] = c.rowcount
+
+        self.conn.commit()
+        return purged
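One detail worth flagging in `purge_disqualified`: the LIKE patterns are interpolated into the SQL with an f-string. That is only safe because `disq_patterns` is a hard-coded constant list. A hedged sketch of the same check using bound parameters instead, which stays safe even if the pattern list ever becomes user-supplied (helper name and sample rows are illustrative):

```python
import sqlite3

DISQ_PATTERNS = ["maga", "conspiracy", "conservative", "antivax", "sovcit"]

def disqualified_ids(conn):
    """ids of humans whose negative_signals match any disqualifying pattern."""
    # one LIKE placeholder per pattern; values are bound, never interpolated
    placeholders = " OR ".join(["negative_signals LIKE ?"] * len(DISQ_PATTERNS))
    params = [f"%{p}%" for p in DISQ_PATTERNS]
    c = conn.execute(f"SELECT id FROM humans WHERE {placeholders} ORDER BY id", params)
    return [row[0] for row in c.fetchall()]

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE humans (id INTEGER PRIMARY KEY, negative_signals TEXT)")
conn.executemany("INSERT INTO humans VALUES (?, ?)",
                 [(1, '["maga"]'), (2, '[]'), (3, '["antivax"]')])
print(disqualified_ids(conn))  # [1, 3]
```

The same id list could then drive the three DELETE statements with an `IN (...)` clause built from `?` placeholders.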
@@ -39,6 +39,36 @@ from .github import get_github_user, get_user_repos, _api_get as github_api
 from .mastodon import analyze_mastodon_user, _api_get as mastodon_api
 from .handles import discover_all_handles, extract_handles_from_text, scrape_website_for_handles

+# MASTODON HANDLE FILTER - don't treat these as emails
+MASTODON_INSTANCES = [
+    'mastodon.social', 'fosstodon.org', 'hachyderm.io', 'tech.lgbt',
+    'social.coop', 'masto.ai', 'infosec.exchange', 'hackers.town',
+    'chaos.social', 'mathstodon.xyz', 'scholar.social', 'mas.to',
+    'mstdn.social', 'mastodon.online', 'universeodon.com', 'mastodon.world',
+]
+
+def is_mastodon_handle(email):
+    """check if string looks like mastodon handle not email"""
+    if not email or '@' not in email:
+        return False
+    email_lower = email.lower()
+    # check for @username@instance pattern
+    parts = email_lower.split('@')
+    if len(parts) == 3 and parts[0] == '':  # @user@instance
+        return True
+    if len(parts) == 2:
+        # check if domain is known mastodon instance
+        domain = parts[1]
+        for instance in MASTODON_INSTANCES:
+            if domain == instance or domain.endswith('.' + instance):
+                return True
+        # also check common patterns
+        if 'mastodon' in domain or 'masto' in domain:
+            return True
+    return False
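A quick standalone check of the filter's behavior, re-declared here with a trimmed instance list so the snippet runs on its own (the full list lives in the diff above):

```python
MASTODON_INSTANCES = ['mastodon.social', 'fosstodon.org', 'hachyderm.io']

def is_mastodon_handle(email):
    """True for @user@instance handles or addresses on known mastodon domains."""
    if not email or '@' not in email:
        return False
    parts = email.lower().split('@')
    if len(parts) == 3 and parts[0] == '':  # @user@instance
        return True
    if len(parts) == 2:
        domain = parts[1]
        if any(domain == i or domain.endswith('.' + i) for i in MASTODON_INSTANCES):
            return True
        if 'mastodon' in domain or 'masto' in domain:
            return True
    return False

print(is_mastodon_handle('@alice@fosstodon.org'))   # True  - fediverse handle
print(is_mastodon_handle('alice@mastodon.social'))  # True  - known instance domain
print(is_mastodon_handle('alice@example.com'))      # False - real email, keep it
```

The `'masto' in domain` heuristic is deliberately loose; it can misclassify a legitimate mail domain containing that substring, which is an acceptable false-positive trade-off for an outreach filter.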


 # local cache for org memberships
 ORG_CACHE_FILE = Path(__file__).parent.parent / 'data' / 'org_cache.json'
 _org_cache = None
@@ -674,7 +704,8 @@ def deep_scrape_github_user(login, scrape_commits=True):
         profile['emails'].extend(website_emails)

     # dedupe emails and pick best one
-    profile['emails'] = list(set(profile['emails']))
+    # FILTER OUT MASTODON HANDLES (they're not emails!)
+    profile['emails'] = [e for e in set(profile['emails']) if e and not is_mastodon_handle(e)]

     # rank emails by preference
     def email_score(email):
@@ -147,6 +147,87 @@ def create_github_issue(owner, repo, title, body, dry_run=False):
         return False, str(e)


+def create_forge_issue(platform_type, instance_url, owner, repo, title, body, dry_run=False):
+    """
+    create issue on self-hosted git forge.
+    supports gitea/forgejo/gogs (same API) and gitlab.
+    """
+    from config import CODEBERG_TOKEN, GITEA_TOKENS, GITLAB_TOKENS
+
+    if dry_run:
+        print(f"  [dry run] would create issue on {platform_type}:{instance_url}/{owner}/{repo}")
+        return True, None
+
+    try:
+        if platform_type in ('gitea', 'forgejo', 'gogs'):
+            # get token for this instance
+            token = None
+            if 'codeberg.org' in instance_url:
+                token = CODEBERG_TOKEN
+            else:
+                token = GITEA_TOKENS.get(instance_url)
+
+            if not token:
+                return False, f"no auth token for {instance_url}"
+
+            # gitea API
+            api_url = f"{instance_url}/api/v1/repos/{owner}/{repo}/issues"
+            headers = {
+                'Content-Type': 'application/json',
+                'Authorization': f'token {token}'
+            }
+            data = {'title': title, 'body': body}
+
+            resp = requests.post(api_url, headers=headers, json=data, timeout=15)
+            if resp.status_code in (200, 201):
+                return True, resp.json().get('html_url')
+            else:
+                return False, f"gitea api error: {resp.status_code} - {resp.text[:200]}"
+
+        elif platform_type == 'gitlab':
+            token = GITLAB_TOKENS.get(instance_url)
+            if not token:
+                return False, f"no auth token for {instance_url}"
+
+            # need to get project ID first
+            search_url = f"{instance_url}/api/v4/projects"
+            headers = {'PRIVATE-TOKEN': token}
+            params = {'search': repo}
+
+            resp = requests.get(search_url, headers=headers, params=params, timeout=15)
+            if resp.status_code != 200:
+                return False, f"gitlab project lookup failed: {resp.status_code}"
+
+            projects = resp.json()
+            project_id = None
+            for p in projects:
+                if p.get('path') == repo or p.get('name') == repo:
+                    project_id = p.get('id')
+                    break
+
+            if not project_id:
+                return False, f"project {repo} not found"
+
+            # create issue
+            issue_url = f"{instance_url}/api/v4/projects/{project_id}/issues"
+            data = {'title': title, 'description': body}
+            resp = requests.post(issue_url, headers=headers, json=data, timeout=15)
+
+            if resp.status_code in (200, 201):
+                return True, resp.json().get('web_url')
+            else:
+                return False, f"gitlab api error: {resp.status_code}"
+
+        elif platform_type == 'sourcehut':
+            return False, "sourcehut uses mailing lists - use email instead"
+
+        else:
+            return False, f"unknown forge type: {platform_type}"
+
+    except Exception as e:
+        return False, str(e)
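For reference, the Gitea/Forgejo branch above boils down to a single authenticated POST. A sketch that only assembles the request pieces, so it runs without network access (the function name and sample values are illustrative):

```python
import json

def build_issue_request(instance_url, owner, repo, token, title, body):
    """assemble the Gitea/Forgejo issue-creation request used above."""
    url = f"{instance_url}/api/v1/repos/{owner}/{repo}/issues"
    headers = {
        'Content-Type': 'application/json',
        'Authorization': f'token {token}',  # gitea-style token auth
    }
    payload = json.dumps({'title': title, 'body': body})
    return url, headers, payload

url, headers, payload = build_issue_request(
    'https://codeberg.org', 'alice', 'widget', 'SECRET',
    'community introduction from connectd', 'hello!')
print(url)  # https://codeberg.org/api/v1/repos/alice/widget/issues
```

GitLab differs in both auth header (`PRIVATE-TOKEN`) and payload key (`description` instead of `body`), which is why the diff keeps the two branches separate.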


 def send_mastodon_dm(recipient_acct, message, dry_run=False):
     """send mastodon direct message"""
     if not MASTODON_TOKEN:
@@ -348,12 +429,13 @@ def determine_best_contact(human):
     return method, info


-def deliver_intro(match_data, intro_draft, dry_run=False):
+def deliver_intro(match_data, intro_draft, subject=None, dry_run=False):
     """
     deliver an intro via the best available method

     match_data: {human_a, human_b, overlap_score, overlap_reasons}
     intro_draft: the text to send (from groq)
+    subject: optional subject line for email/github (from groq)
     """
     recipient = match_data.get('human_b', {})
     recipient_id = f"{recipient.get('platform')}:{recipient.get('username')}"
@@ -365,6 +447,10 @@ def deliver_intro(match_data, intro_draft, dry_run=False):
     # determine contact method
     method, contact_info = determine_best_contact(recipient)

+    # if no contact method found, skip (will retry after deeper scraping)
+    if method is None:
+        return False, "no contact method found - needs deeper scraping", None
+
     log = load_delivery_log()
     result = {
         'recipient_id': recipient_id,
@@ -373,14 +459,15 @@ def deliver_intro(match_data, intro_draft, dry_run=False):
         'contact_info': contact_info,
         'overlap_score': match_data.get('overlap_score'),
         'timestamp': datetime.now().isoformat(),
+        'draft': intro_draft,  # store the actual message sent
     }

     success = False
     error = None

     if method == 'email':
-        subject = f"someone you might want to know - connectd"
-        success, error = send_email(contact_info, subject, intro_draft, dry_run)
+        email_subject = subject or "connecting builders - someone you might want to know"
+        success, error = send_email(contact_info, email_subject, intro_draft, dry_run)

     elif method == 'mastodon':
         success, error = send_mastodon_dm(contact_info, intro_draft, dry_run)
@@ -402,7 +489,7 @@ def deliver_intro(match_data, intro_draft, dry_run=False):
     elif method == 'github_issue':
         owner = contact_info.get('owner')
         repo = contact_info.get('repo')
-        title = "community introduction from connectd"
+        title = subject or "community introduction from connectd"
        # format for github
         github_body = f"""hey {recipient.get('name') or recipient.get('username')},

@@ -413,19 +500,94 @@ def deliver_intro(match_data, intro_draft, dry_run=False):
 """
         success, error = create_github_issue(owner, repo, title, github_body, dry_run)

+    elif method == 'forge_issue':
+        # self-hosted git forge issue (gitea/forgejo/gitlab/sourcehut)
+        platform_type = contact_info.get('platform_type')
+        instance_url = contact_info.get('instance_url')
+        owner = contact_info.get('owner')
+        repo = contact_info.get('repo')
+        title = subject or "community introduction from connectd"
+
+        # get the other person's contact info for bidirectional link
+        sender = match_data.get('human_a', {})
+        sender_name = sender.get('name') or sender.get('username') or 'someone'
+        sender_platform = sender.get('platform', '')
+        sender_url = sender.get('url', '')
+
+        if not sender_url:
+            if sender_platform == 'github':
+                sender_url = f"https://github.com/{sender.get('username')}"
+            elif sender_platform == 'mastodon':
+                sender_url = f"https://fosstodon.org/@{sender.get('username')}"
+            elif ':' in sender_platform:  # forge platform
+                extra = sender.get('extra', {})
+                if isinstance(extra, str):
+                    import json as _json
+                    extra = _json.loads(extra) if extra else {}
+                sender_url = extra.get('instance_url', '') + '/' + sender.get('username', '')
+
+        forge_body = f"""hey {recipient.get('name') or recipient.get('username')},
+
+{intro_draft}
+
+**reach them at:** {sender_url or 'see their profile'}
+
+---
+*this is an automated introduction from [connectd](https://github.com/connectd-daemon) - a daemon that finds isolated builders with aligned values and connects them.*
+
+*if this feels spammy, close this issue and we won't reach out again.*
+"""
+        success, error = create_forge_issue(platform_type, instance_url, owner, repo, title, forge_body, dry_run)
+
     elif method == 'manual':
-        # add to review queue
-        add_to_manual_queue({
-            'match': match_data,
-            'draft': intro_draft,
-            'recipient': recipient,
-        })
-        success = True
-        error = "added to manual queue"
+        # skip - no longer using manual queue
+        success = False
+        error = "manual method deprecated - skipping"
+
+    # FALLBACK CHAIN: if primary method failed, try fallbacks
+    if not success and fallbacks:
+        for fallback_method, fallback_info in fallbacks:
+            result['fallback_attempts'] = result.get('fallback_attempts', [])
+            result['fallback_attempts'].append({'method': fallback_method})
+
+            fb_success = False
+            fb_error = None
+
+            if fallback_method == 'email':
+                fb_success, fb_error = send_email(fallback_info, email_subject, intro_draft, dry_run)
+            elif fallback_method == 'mastodon':
+                fb_success, fb_error = send_mastodon_dm(fallback_info, intro_draft, dry_run)
+            elif fallback_method == 'bluesky':
+                fb_success, fb_error = send_bluesky_dm(fallback_info, intro_draft, dry_run)
+            elif fallback_method == 'matrix':
+                fb_success, fb_error = send_matrix_dm(fallback_info, intro_draft, dry_run)
+            elif fallback_method == 'github_issue':
+                owner = fallback_info.get('owner') if isinstance(fallback_info, dict) else fallback_info.split('/')[0]
+                repo = fallback_info.get('repo') if isinstance(fallback_info, dict) else fallback_info.split('/')[1]
+                fb_success, fb_error = create_github_issue(owner, repo, email_subject, intro_draft, dry_run)
+            elif fallback_method == 'forge_issue':
+                fb_success, fb_error = create_forge_issue(
+                    fallback_info.get('platform_type'),
+                    fallback_info.get('instance_url'),
+                    fallback_info.get('owner'),
+                    fallback_info.get('repo'),
+                    email_subject, intro_draft, dry_run
+                )
+
+            if fb_success:
+                success = True
+                method = fallback_method
+                contact_info = fallback_info
+                error = None
+                result['fallback_succeeded'] = fallback_method
+                break
+            else:
+                result['fallback_attempts'][-1]['error'] = fb_error

     # log result
     result['success'] = success
     result['error'] = error
+    result['final_method'] = method

     if success:
         log['sent'].append(result)
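The fallback chain introduced here is a generic "try senders in order, record each attempt, stop at the first success" pattern. A self-contained sketch (the sender callables are stand-ins for send_email / send_mastodon_dm and friends, which all share the `(success, error)` return shape):

```python
def try_with_fallbacks(senders, message):
    """try each (name, send_fn) in order; stop at the first success."""
    attempts = []
    for name, send in senders:
        ok, err = send(message)
        attempts.append({'method': name, 'error': None if ok else err})
        if ok:
            return name, attempts
    return None, attempts

senders = [
    ('email',    lambda msg: (False, 'smtp down')),  # primary fails
    ('mastodon', lambda msg: (True, None)),          # first fallback works
    ('bluesky',  lambda msg: (True, None)),          # never reached
]
method, attempts = try_with_fallbacks(senders, 'hi')
print(method, len(attempts))  # mastodon 2
```

Keeping the per-attempt error log, as the diff does with `result['fallback_attempts']`, matters for debugging why a recipient was ultimately unreachable.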
@@ -7,10 +7,21 @@ services:
       - .env
     ports:
       - "8099:8099"
+    extra_hosts:
+      - "mastodon.sudoxreboot.com:192.168.1.39"
     volumes:
       - ./data:/app/data
       - ./db:/app/db
-    # daemon runs continuously by default
-    # for one-shot or dry-run, override command:
-    # command: ["python", "daemon.py", "--dry-run"]
-    # command: ["python", "cli.py", "scout"]
+      - ./data_db:/data/db
+      - ./daemon.py:/app/daemon.py:ro
+      - ./deep.py:/app/scoutd/deep.py:ro
+      - ./db_init.py:/app/db/__init__.py:ro
+      - ./config.py:/app/config.py:ro
+      - ./groq_draft.py:/app/introd/groq_draft.py:ro
+      - ./api.py:/app/api.py:ro
+      - ./deliver.py:/app/introd/deliver.py:ro
+      - ./soul.txt:/app/soul.txt:ro
+      - ./scoutd/reddit.py:/app/scoutd/reddit.py:ro
+      - ./matchd/overlap.py:/app/matchd/overlap.py:ro
+      - ./central_client.py:/app/central_client.py:ro
+      - ./scoutd/forges.py:/app/scoutd/forges.py:ro

favicon.png (new binary file, 1.4 MiB; not shown)

groq_draft.py (new file, 419 lines)
@@ -0,0 +1,419 @@
"""
connectd - groq message drafting
reads soul from file, uses as guideline for llm to personalize
"""

import os
import json
from groq import Groq

GROQ_API_KEY = os.getenv("GROQ_API_KEY")
GROQ_MODEL = os.getenv("GROQ_MODEL", "llama-3.3-70b-versatile")

client = Groq(api_key=GROQ_API_KEY) if GROQ_API_KEY else None

# load soul from file (guideline, not script)
SOUL_PATH = os.getenv("SOUL_PATH", "/app/soul.txt")
def load_soul():
    try:
        with open(SOUL_PATH, 'r') as f:
            return f.read().strip()
    except:
        return None

SIGNATURE_HTML = """
<div style="margin-top: 24px; padding-top: 16px; border-top: 1px solid #333;">
<div style="margin-bottom: 12px;">
<a href="https://github.com/sudoxnym/connectd" style="color: #8b5cf6; text-decoration: none; font-size: 14px;">github.com/sudoxnym/connectd</a>
<span style="color: #666; font-size: 12px; margin-left: 8px;">(main repo)</span>
</div>
<div style="display: flex; gap: 16px; align-items: center;">
<a href="https://github.com/connectd-daemon" title="GitHub" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M12 .297c-6.63 0-12 5.373-12 12 0 5.303 3.438 9.8 8.205 11.385.6.113.82-.258.82-.577 0-.285-.01-1.04-.015-2.04-3.338.724-4.042-1.61-4.042-1.61C4.422 18.07 3.633 17.7 3.633 17.7c-1.087-.744.084-.729.084-.729 1.205.084 1.838 1.236 1.838 1.236 1.07 1.835 2.809 1.305 3.495.998.108-.776.417-1.305.76-1.605-2.665-.3-5.466-1.332-5.466-5.93 0-1.31.465-2.38 1.235-3.22-.135-.303-.54-1.523.105-3.176 0 0 1.005-.322 3.3 1.23.96-.267 1.98-.399 3-.405 1.02.006 2.04.138 3 .405 2.28-1.552 3.285-1.23 3.285-1.23.645 1.653.24 2.873.12 3.176.765.84 1.23 1.91 1.23 3.22 0 4.61-2.805 5.625-5.475 5.92.42.36.81 1.096.81 2.22 0 1.606-.015 2.896-.015 3.286 0 .315.21.69.825.57C20.565 22.092 24 17.592 24 12.297c0-6.627-5.373-12-12-12"/></svg>
</a>
<a href="https://mastodon.sudoxreboot.com/@connectd" title="Mastodon" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M23.268 5.313c-.35-2.578-2.617-4.61-5.304-5.004C17.51.242 15.792 0 11.813 0h-.03c-3.98 0-4.835.242-5.288.309C3.882.692 1.496 2.518.917 5.127.64 6.412.61 7.837.661 9.143c.074 1.874.088 3.745.26 5.611.118 1.24.325 2.47.62 3.68.55 2.237 2.777 4.098 4.96 4.857 2.336.792 4.849.923 7.256.38.265-.061.527-.132.786-.213.585-.184 1.27-.39 1.774-.753a.057.057 0 0 0 .023-.043v-1.809a.052.052 0 0 0-.02-.041.053.053 0 0 0-.046-.01 20.282 20.282 0 0 1-4.709.545c-2.73 0-3.463-1.284-3.674-1.818a5.593 5.593 0 0 1-.319-1.433.053.053 0 0 1 .066-.054c1.517.363 3.072.546 4.632.546.376 0 .75 0 1.125-.01 1.57-.044 3.224-.124 4.768-.422.038-.008.077-.015.11-.024 2.435-.464 4.753-1.92 4.989-5.604.008-.145.03-1.52.03-1.67.002-.512.167-3.63-.024-5.545zm-3.748 9.195h-2.561V8.29c0-1.309-.55-1.976-1.67-1.976-1.23 0-1.846.79-1.846 2.35v3.403h-2.546V8.663c0-1.56-.617-2.35-1.848-2.35-1.112 0-1.668.668-1.67 1.977v6.218H4.822V8.102c0-1.31.337-2.35 1.011-3.12.696-.77 1.608-1.164 2.74-1.164 1.311 0 2.302.5 2.962 1.498l.638 1.06.638-1.06c.66-.999 1.65-1.498 2.96-1.498 1.13 0 2.043.395 2.74 1.164.675.77 1.012 1.81 1.012 3.12z"/></svg>
</a>
<a href="https://bsky.app/profile/connectd.bsky.social" title="Bluesky" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M5.202 2.857C7.954 4.922 10.913 9.11 12 11.358c1.087-2.247 4.046-6.436 6.798-8.501C20.783 1.366 24 .213 24 3.883c0 .732-.42 6.156-.667 7.037-.856 3.061-3.978 3.842-6.755 3.37 4.854.826 6.089 3.562 3.422 6.299-5.065 5.196-7.28-1.304-7.847-2.97-.104-.305-.152-.448-.153-.327 0-.121-.05.022-.153.327-.568 1.666-2.782 8.166-7.847 2.97-2.667-2.737-1.432-5.473 3.422-6.3-2.777.473-5.899-.308-6.755-3.369C.42 10.04 0 4.615 0 3.883c0-3.67 3.217-2.517 5.202-1.026"/></svg>
</a>
<a href="https://lemmy.sudoxreboot.com/c/connectd" title="Lemmy" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M2.9595 4.2228a3.9132 3.9132 0 0 0-.332.019c-.8781.1012-1.67.5699-2.155 1.3862-.475.8-.5922 1.6809-.35 2.4971.2421.8162.8297 1.5575 1.6982 2.1449.0053.0035.0106.0076.0163.0114.746.4498 1.492.7431 2.2877.8994-.02.3318-.0272.6689-.006 1.0181.0634 1.0432.4368 2.0006.996 2.8492l-2.0061.8189a.4163.4163 0 0 0-.2276.2239.416.416 0 0 0 .0879.455.415.415 0 0 0 .2941.1231.4156.4156 0 0 0 .1595-.0312l2.2093-.9035c.408.4859.8695.9315 1.3723 1.318.0196.0151.0407.0264.0603.0423l-1.2918 1.7103a.416.416 0 0 0 .664.501l1.314-1.7385c.7185.4548 1.4782.7927 2.2294 1.0242.3833.7209 1.1379 1.1871 2.0202 1.1871.8907 0 1.6442-.501 2.0242-1.2072.744-.2347 1.4959-.5729 2.2073-1.0262l1.332 1.7606a.4157.4157 0 0 0 .7439-.1936.4165.4165 0 0 0-.0799-.3074l-1.3099-1.7345c.0083-.0075.0178-.0113.0261-.0188.4968-.3803.9549-.8175 1.3622-1.2939l2.155.8794a.4156.4156 0 0 0 .5412-.2276.4151.4151 0 0 0-.2273-.5432l-1.9438-.7928c.577-.8538.9697-1.8183 1.0504-2.8693.0268-.3507.0242-.6914.0079-1.0262.7905-.1572 1.5321-.4502 2.2737-.8974.0053-.0033.011-.0076.0163-.0113.8684-.5874 1.456-1.3287 1.6982-2.145.2421-.8161.125-1.697-.3501-2.497-.4849-.8163-1.2768-1.2852-2.155-1.3863a3.2175 3.2175 0 0 0-.332-.0189c-.7852-.0151-1.6231.229-2.4286.6942-.5926.342-1.1252.867-1.5433 1.4387-1.1699-.6703-2.6923-1.0476-4.5635-1.0785a15.5768 15.5768 0 0 0-.5111 0c-2.085.034-3.7537.43-5.0142 1.1449-.0033-.0038-.0045-.0114-.008-.0152-.4233-.5916-.973-1.1365-1.5835-1.489-.8055-.465-1.6434-.7083-2.4286-.6941Zm.2858.7365c.5568.042 1.1696.2358 1.7787.5875.485.28.9757.7554 1.346 1.2696a5.6875 5.6875 0 0 0-.4969.4085c-.9201.8516-1.4615 1.9597-1.668 3.2335-.6809-.1402-1.3183-.3945-1.984-.7948-.7553-.5128-1.2159-1.1225-1.4004-1.7445-.1851-.624-.1074-1.2712.2776-1.9196.3743-.63.9275-.9534 1.6118-1.0322a2.796 2.796 0 0 1 .5352-.0076Zm17.5094 0a2.797 2.797 0 0 1 .5353.0075c.6842.0786 1.2374.4021 1.6117 1.0322.385.6484.4627 1.2957.2776 1.9196-.1845.622-.645 
1.2317-1.4004 1.7445-.6578.3955-1.2881.6472-1.9598.7888-.1942-1.2968-.7375-2.4338-1.666-3.302a5.5639 5.5639 0 0 0-.4709-.3923c.3645-.49.8287-.9428 1.2938-1.2113.6091-.3515 1.2219-.5454 1.7787-.5875ZM12.006 6.0036a14.832 14.832 0 0 1 .487 0c2.3901.0393 4.0848.67 5.1631 1.678 1.1501 1.0754 1.6423 2.6006 1.499 4.467-.1311 1.7079-1.2203 3.2281-2.652 4.324-.694.5313-1.4626.9354-2.2254 1.2294.0031-.0453.014-.0888.014-.1349.0029-1.1964-.9313-2.2133-2.2918-2.2133-1.3606 0-2.3222 1.0154-2.2918 2.2213.0013.0507.014.0972.0181.1471-.781-.2933-1.5696-.7013-2.2777-1.2456-1.4239-1.0945-2.4997-2.6129-2.6037-4.322-.1129-1.8567.3778-3.3382 1.5212-4.3965C7.5094 6.7 9.352 6.047 12.006 6.0036Zm-3.6419 6.8291c-.6053 0-1.0966.4903-1.0966 1.0966 0 .6063.4913 1.0986 1.0966 1.0986s1.0966-.4923 1.0966-1.0986c0-.6063-.4913-1.0966-1.0966-1.0966zm7.2819.0113c-.5998 0-1.0866.4859-1.0866 1.0866s.4868 1.0885 1.0866 1.0885c.5997 0 1.0865-.4878 1.0865-1.0885s-.4868-1.0866-1.0865-1.0866zM12 16.0835c1.0237 0 1.5654.638 1.5634 1.4829-.0018.7849-.6723 1.485-1.5634 1.485-.9167 0-1.54-.5629-1.5634-1.493-.0212-.8347.5397-1.4749 1.5634-1.4749Z"/></svg>
</a>
<a href="https://discord.gg/connectd" title="Discord" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M20.317 4.3698a19.7913 19.7913 0 00-4.8851-1.5152.0741.0741 0 00-.0785.0371c-.211.3753-.4447.8648-.6083 1.2495-1.8447-.2762-3.68-.2762-5.4868 0-.1636-.3933-.4058-.8742-.6177-1.2495a.077.077 0 00-.0785-.037 19.7363 19.7363 0 00-4.8852 1.515.0699.0699 0 00-.0321.0277C.5334 9.0458-.319 13.5799.0992 18.0578a.0824.0824 0 00.0312.0561c2.0528 1.5076 4.0413 2.4228 5.9929 3.0294a.0777.0777 0 00.0842-.0276c.4616-.6304.8731-1.2952 1.226-1.9942a.076.076 0 00-.0416-.1057c-.6528-.2476-1.2743-.5495-1.8722-.8923a.077.077 0 01-.0076-.1277c.1258-.0943.2517-.1923.3718-.2914a.0743.0743 0 01.0776-.0105c3.9278 1.7933 8.18 1.7933 12.0614 0a.0739.0739 0 01.0785.0095c.1202.099.246.1981.3728.2924a.077.077 0 01-.0066.1276 12.2986 12.2986 0 01-1.873.8914.0766.0766 0 00-.0407.1067c.3604.698.7719 1.3628 1.225 1.9932a.076.076 0 00.0842.0286c1.961-.6067 3.9495-1.5219 6.0023-3.0294a.077.077 0 00.0313-.0552c.5004-5.177-.8382-9.6739-3.5485-13.6604a.061.061 0 00-.0312-.0286zM8.02 15.3312c-1.1825 0-2.1569-1.0857-2.1569-2.419 0-1.3332.9555-2.4189 2.157-2.4189 1.2108 0 2.1757 1.0952 2.1568 2.419 0 1.3332-.9555 2.4189-2.1569 2.4189zm7.9748 0c-1.1825 0-2.1569-1.0857-2.1569-2.419 0-1.3332.9554-2.4189 2.1569-2.4189 1.2108 0 2.1757 1.0952 2.1568 2.419 0 1.3332-.946 2.4189-2.1568 2.4189Z"/></svg>
</a>
<a href="https://matrix.to/#/@connectd:sudoxreboot.com" title="Matrix" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M.632.55v22.9H2.28V24H0V0h2.28v.55zm7.043 7.26v1.157h.033c.309-.443.683-.784 1.117-1.024.433-.245.936-.365 1.5-.365.54 0 1.033.107 1.481.314.448.208.785.582 1.02 1.108.254-.374.6-.706 1.034-.992.434-.287.95-.43 1.546-.43.453 0 .872.056 1.26.167.388.11.716.286.993.53.276.245.489.559.646.951.152.392.23.863.23 1.417v5.728h-2.349V11.52c0-.286-.01-.559-.032-.812a1.755 1.755 0 0 0-.18-.66 1.106 1.106 0 0 0-.438-.448c-.194-.11-.457-.166-.785-.166-.332 0-.6.064-.803.189a1.38 1.38 0 0 0-.48.499 1.946 1.946 0 0 0-.231.696 5.56 5.56 0 0 0-.06.785v4.768h-2.35v-4.8c0-.254-.004-.503-.018-.752a2.074 2.074 0 0 0-.143-.688 1.052 1.052 0 0 0-.415-.503c-.194-.125-.476-.19-.854-.19-.111 0-.259.024-.439.074-.18.051-.36.143-.53.282-.171.138-.319.337-.439.595-.12.259-.18.6-.18 1.02v4.966H5.46V7.81zm15.693 15.64V.55H21.72V0H24v24h-2.28v-.55z"/></svg>
</a>
<a href="https://reddit.com/r/connectd" title="Reddit" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M12 0C5.373 0 0 5.373 0 12c0 3.314 1.343 6.314 3.515 8.485l-2.286 2.286C.775 23.225 1.097 24 1.738 24H12c6.627 0 12-5.373 12-12S18.627 0 12 0Zm4.388 3.199c1.104 0 1.999.895 1.999 1.999 0 1.105-.895 2-1.999 2-.946 0-1.739-.657-1.947-1.539v.002c-1.147.162-2.032 1.15-2.032 2.341v.007c1.776.067 3.4.567 4.686 1.363.473-.363 1.064-.58 1.707-.58 1.547 0 2.802 1.254 2.802 2.802 0 1.117-.655 2.081-1.601 2.531-.088 3.256-3.637 5.876-7.997 5.876-4.361 0-7.905-2.617-7.998-5.87-.954-.447-1.614-1.415-1.614-2.538 0-1.548 1.255-2.802 2.803-2.802.645 0 1.239.218 1.712.585 1.275-.79 2.881-1.291 4.64-1.365v-.01c0-1.663 1.263-3.034 2.88-3.207.188-.911.993-1.595 1.959-1.595Zm-8.085 8.376c-.784 0-1.459.78-1.506 1.797-.047 1.016.64 1.429 1.426 1.429.786 0 1.371-.369 1.418-1.385.047-1.017-.553-1.841-1.338-1.841Zm7.406 0c-.786 0-1.385.824-1.338 1.841.047 1.017.634 1.385 1.418 1.385.785 0 1.473-.413 1.426-1.429-.046-1.017-.721-1.797-1.506-1.797Zm-3.703 4.013c-.974 0-1.907.048-2.77.135-.147.015-.241.168-.183.305.483 1.154 1.622 1.964 2.953 1.964 1.33 0 2.47-.81 2.953-1.964.057-.137-.037-.29-.184-.305-.863-.087-1.795-.135-2.769-.135Z"/></svg>
</a>
<a href="mailto:connectd@sudoxreboot.com" title="Email" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M1.5 8.67v8.58a3 3 0 003 3h15a3 3 0 003-3V8.67l-8.928 5.493a3 3 0 01-3.144 0L1.5 8.67z"/><path d="M22.5 6.908V6.75a3 3 0 00-3-3h-15a3 3 0 00-3 3v.158l9.714 5.978a1.5 1.5 0 001.572 0L22.5 6.908z"/></svg>
</a>
</div>
</div>
"""

SIGNATURE_PLAINTEXT = """
|
||||
---
|
||||
github.com/sudoxnym/connectd (main repo)
|
||||
|
||||
github: github.com/connectd-daemon
|
||||
mastodon: @connectd@mastodon.sudoxreboot.com
|
||||
bluesky: connectd.bsky.social
|
||||
lemmy: lemmy.sudoxreboot.com/c/connectd
|
||||
discord: discord.gg/connectd
|
||||
matrix: @connectd:sudoxreboot.com
|
||||
reddit: reddit.com/r/connectd
|
||||
email: connectd@sudoxreboot.com
|
||||
"""
|
||||
|
||||
|
||||
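Both signature constants are appended by the drafting code (the HTML one onto `draft_html`, the plaintext one onto `draft_plain`). For delivery, the two variants would typically travel in one `multipart/alternative` email. A minimal sketch using only the standard library — an illustration, not connectd's actual send path, and the signature values here are shortened placeholders:

```python
from email.message import EmailMessage

# shortened placeholders standing in for SIGNATURE_HTML / SIGNATURE_PLAINTEXT
SIG_PLAIN = "\n---\ngithub.com/sudoxnym/connectd (main repo)\n"
SIG_HTML = "<div style='color: #888;'>github.com/sudoxnym/connectd</div>"

def build_intro_email(subject: str, body: str) -> EmailMessage:
    msg = EmailMessage()
    msg["Subject"] = subject
    # plain part first; the HTML part is added as the preferred alternative
    msg.set_content(body + SIG_PLAIN)
    msg.add_alternative(f"<pre>{body}</pre>{SIG_HTML}", subtype="html")
    return msg

msg = build_intro_email("found you", "hey - there's someone you should meet.")
print(msg.get_content_type())  # multipart/alternative
```

Mail clients that can't render HTML fall back to the plain part, so both signatures stay reachable.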
def draft_intro_with_llm(match_data: dict, recipient: str = 'a', dry_run: bool = True):
    """
    draft an intro message using groq llm.

    args:
        match_data: dict with human_a, human_b, overlap_score, overlap_reasons
        recipient: 'a' or 'b' - who receives the message
        dry_run: if True, preview mode

    returns:
        tuple (result_dict, error_string)
        result_dict has: subject, draft_html, draft_plain
    """
    if not client:
        return None, "GROQ_API_KEY not set"

    try:
        human_a = match_data.get('human_a', {})
        human_b = match_data.get('human_b', {})
        reasons = match_data.get('overlap_reasons', [])

        # recipient gets the message, about_person is who we're introducing them to
        if recipient == 'a':
            to_person = human_a
            about_person = human_b
        else:
            to_person = human_b
            about_person = human_a

        to_name = to_person.get('username', 'friend')
        about_name = about_person.get('username', 'someone')

        # extract contact info for about_person
        # parse extra/contact up front so _json is always bound and
        # the bio lookup below never hits a raw json string
        import json as _json
        about_extra = about_person.get('extra', {})
        if isinstance(about_extra, str):
            about_extra = _json.loads(about_extra) if about_extra else {}
        about_contact = about_person.get('contact', {})
        if isinstance(about_contact, str):
            about_contact = _json.loads(about_contact) if about_contact else {}

        about_bio = about_extra.get('bio', '')

        # build contact link for about_person
        about_platform = about_person.get('platform', '')
        about_username = about_person.get('username', '')
        contact_link = None
        if about_platform == 'mastodon' and about_username:
            if '@' in about_username:
                parts = about_username.split('@')
                if len(parts) >= 2:
                    contact_link = f"https://{parts[1]}/@{parts[0]}"
        elif about_platform == 'github' and about_username:
            contact_link = f"https://github.com/{about_username}"
        elif about_extra.get('mastodon') or about_contact.get('mastodon'):
            handle = about_extra.get('mastodon') or about_contact.get('mastodon')
            if '@' in handle:
                parts = handle.lstrip('@').split('@')
                if len(parts) >= 2:
                    contact_link = f"https://{parts[1]}/@{parts[0]}"
        elif about_extra.get('github') or about_contact.get('github'):
            contact_link = f"https://github.com/{about_extra.get('github') or about_contact.get('github')}"
        elif about_extra.get('email'):
            contact_link = about_extra['email']
        elif about_contact.get('email'):
            contact_link = about_contact['email']
        elif about_extra.get('website'):
            contact_link = about_extra['website']
        elif about_extra.get('external_links', {}).get('website'):
            contact_link = about_extra['external_links']['website']
        elif about_extra.get('extra', {}).get('website'):
            contact_link = about_extra['extra']['website']
        elif about_platform == 'reddit' and about_username:
            contact_link = f"reddit.com/u/{about_username}"

        if not contact_link:
            contact_link = f"github.com/{about_username}" if about_username else "reach out via connectd"

        # skip if no real contact method (just reddit or generic)
        if contact_link.startswith('reddit.com') or contact_link == "reach out via connectd" or 'stackblitz' in contact_link:
            return None, f"no real contact info for {about_name} - skipping draft"

        # format the shared factors naturally
        if reasons:
            factor = ', '.join(reasons[:3]) if len(reasons) > 1 else reasons[0]
        else:
            factor = "shared values and interests"

        # load soul as guideline
        soul = load_soul()
        if not soul:
            return None, "could not load soul file"

        # build the prompt - soul is a GUIDELINE, not a script
        prompt = f"""you are connectd, a daemon that finds isolated builders and connects them.

write a personal message TO {to_name} telling them about {about_name}.

here is the soul/spirit of what connectd is about - use this as a GUIDELINE for tone and message, NOT as a script to copy verbatim:

---
{soul}
---

key facts for this message:
- recipient: {to_name}
- introducing them to: {about_name}
- their shared interests/values: {factor}
- about {about_name}: {about_bio if about_bio else 'a builder like you'}
- HOW TO REACH {about_name}: {contact_link}

RULES:
1. say their name ONCE at start, then use "you"
2. MUST include how to reach {about_name}: {contact_link}
3. lowercase, raw, emotional - follow the soul
4. end with the contact link

return ONLY the message body. signature is added separately."""

        response = client.chat.completions.create(
            model=GROQ_MODEL,
            messages=[{"role": "user", "content": prompt}],
            temperature=0.6,
            max_tokens=1200
        )

        body = response.choices[0].message.content.strip()

        # generate subject
        subject_prompt = f"""generate a short, lowercase email subject for a message to {to_name} about connecting them with {about_name} over their shared interest in {factor}.

no corporate speak. no clickbait. raw and real.
examples:
- "found you, {to_name}"
- "you're not alone"
- "a door just opened"
- "{to_name}, there's someone you should meet"

return ONLY the subject line."""

        subject_response = client.chat.completions.create(
            model=GROQ_MODEL,
            messages=[{"role": "user", "content": subject_prompt}],
            temperature=0.9,
            max_tokens=50
        )

        subject = subject_response.choices[0].message.content.strip().strip('"').strip("'")

        # format html
        draft_html = f"<div style='font-family: monospace; white-space: pre-wrap; color: #e0e0e0; background: #1a1a1a; padding: 20px;'>{body}</div>{SIGNATURE_HTML}"
        draft_plain = body + SIGNATURE_PLAINTEXT

        return {
            'subject': subject,
            'draft_html': draft_html,
            'draft_plain': draft_plain
        }, None

    except Exception as e:
        return None, str(e)


# for backwards compat with old code
def draft_message(person: dict, factor: str, platform: str = "email") -> dict:
    """legacy function - wraps new api"""
    match_data = {
        'human_a': {'username': 'recipient'},
        'human_b': person,
        'overlap_reasons': [factor]
    }
    result, error = draft_intro_with_llm(match_data, recipient='a')
    if error:
        raise ValueError(error)
    return {
        'subject': result['subject'],
        'body_html': result['draft_html'],
        'body_plain': result['draft_plain']
    }


if __name__ == "__main__":
    # test
    test_data = {
        'human_a': {'username': 'sudoxnym', 'extra': {'bio': 'building intentional communities'}},
        'human_b': {'username': 'testuser', 'extra': {'bio': 'home assistant enthusiast'}},
        'overlap_reasons': ['home-assistant', 'open source', 'community building']
    }
    result, error = draft_intro_with_llm(test_data, recipient='a')
    if error:
        print(f"error: {error}")
    else:
        print(f"subject: {result['subject']}")
        print(f"\nbody:\n{result['draft_plain']}")

# contact method ranking - USAGE BASED
# we rank by where the person is MOST ACTIVE, not by our preference

def determine_contact_method(human):
    """
    determine ALL available contact methods, ranked by the USER'S ACTIVITY.

    looks at activity metrics to decide where they're most engaged.
    returns: (best_method, best_info, fallbacks)
    where fallbacks is a list of (method, info) tuples in activity order
    """
    import json

    extra = human.get('extra', {})
    contact = human.get('contact', {})

    if isinstance(extra, str):
        extra = json.loads(extra) if extra else {}
    if isinstance(contact, str):
        contact = json.loads(contact) if contact else {}

    nested_extra = extra.get('extra', {})
    platform = human.get('platform', '')

    available = []

    # === ACTIVITY SCORING ===
    # each method gets scored by how active the user is there

    # EMAIL - always medium priority (we can't measure activity)
    email = extra.get('email') or contact.get('email') or nested_extra.get('email')
    if email and '@' in str(email):
        available.append(('email', email, 50))  # baseline score

    # MASTODON - score by post count / followers
    mastodon = extra.get('mastodon') or contact.get('mastodon') or nested_extra.get('mastodon')
    if mastodon:
        masto_activity = extra.get('mastodon_posts', 0) or extra.get('statuses_count', 0)
        masto_score = min(100, 30 + (masto_activity // 10))  # 30 base + 1 per 10 posts
        available.append(('mastodon', mastodon, masto_score))

    # if they CAME FROM mastodon, that's their primary
    if platform == 'mastodon':
        handle = f"@{human.get('username')}"
        instance = human.get('instance') or extra.get('instance') or ''
        if instance:
            handle = f"@{human.get('username')}@{instance}"
        activity = extra.get('statuses_count', 0) or extra.get('activity_count', 0)
        score = min(100, 50 + (activity // 5))  # higher base since it's their home
        # don't dupe
        if not any(a[0] == 'mastodon' for a in available):
            available.append(('mastodon', handle, score))
        else:
            # update score if this is higher
            for i, (m, info, s) in enumerate(available):
                if m == 'mastodon' and score > s:
                    available[i] = ('mastodon', handle, score)

    # MATRIX - score by presence (binary for now)
    matrix = extra.get('matrix') or contact.get('matrix') or nested_extra.get('matrix')
    if matrix and ':' in str(matrix):
        available.append(('matrix', matrix, 40))

    # BLUESKY - score by followers/posts if available
    bluesky = extra.get('bluesky') or contact.get('bluesky') or nested_extra.get('bluesky')
    if bluesky:
        bsky_activity = extra.get('bluesky_posts', 0)
        bsky_score = min(100, 25 + (bsky_activity // 10))
        available.append(('bluesky', bluesky, bsky_score))

    # LEMMY - score by activity
    lemmy = extra.get('lemmy') or contact.get('lemmy') or nested_extra.get('lemmy')
    if lemmy:
        lemmy_activity = extra.get('lemmy_posts', 0) or extra.get('lemmy_comments', 0)
        lemmy_score = min(100, 30 + lemmy_activity)
        available.append(('lemmy', lemmy, lemmy_score))

    if platform == 'lemmy':
        handle = human.get('username')
        activity = extra.get('activity_count', 0)
        score = min(100, 50 + activity)
        if not any(a[0] == 'lemmy' for a in available):
            available.append(('lemmy', handle, score))

    # DISCORD - lower priority (hard to DM)
    discord = extra.get('discord') or contact.get('discord') or nested_extra.get('discord')
    if discord:
        available.append(('discord', discord, 20))

    # GITHUB ISSUE - for github users, score by repo activity
    if platform == 'github':
        top_repos = extra.get('top_repos', [])
        if top_repos:
            repo = top_repos[0] if isinstance(top_repos[0], str) else top_repos[0].get('name', '')
            stars = extra.get('total_stars', 0)
            repos_count = extra.get('repos_count', 0)
            # active github user = higher issue score
            gh_score = min(60, 20 + (stars // 100) + (repos_count // 5))
            if repo:
                available.append(('github_issue', f"{human.get('username')}/{repo}", gh_score))

    # FORGE ISSUE - for self-hosted git users (gitea/forgejo/gitlab/sourcehut/codeberg)
    # these are HIGH SIGNAL users - they actually self-host
    if platform and ':' in platform:
        platform_type, instance = platform.split(':', 1)
        if platform_type in ('gitea', 'forgejo', 'gogs', 'gitlab', 'sourcehut'):
            repos = extra.get('repos', [])
            if repos:
                repo = repos[0] if isinstance(repos[0], str) else repos[0].get('name', '')
                instance_url = extra.get('instance_url', '')
                if repo and instance_url:
                    # forge users get priority over a typical github_issue score (they self-host!)
                    forge_score = 55
                    available.append(('forge_issue', {
                        'platform_type': platform_type,
                        'instance': instance,
                        'instance_url': instance_url,
                        'owner': human.get('username'),
                        'repo': repo
                    }, forge_score))

    # REDDIT - discovered people, use their other links
    if platform == 'reddit':
        reddit_activity = extra.get('reddit_activity', 0) or extra.get('activity_count', 0)
        # reddit users we reach via their external links (email, mastodon, etc)
        # boost their other methods if reddit is their main platform
        for i, (m, info, score) in enumerate(available):
            if m in ('email', 'mastodon', 'matrix', 'bluesky'):
                # boost score for reddit-discovered users' external contacts
                boost = min(30, reddit_activity // 3)
                available[i] = (m, info, score + boost)

    # sort by activity score (highest first)
    available.sort(key=lambda x: x[2], reverse=True)

    if not available:
        return 'manual', None, []

    best = available[0]
    fallbacks = [(m, i) for m, i, p in available[1:]]

    return best[0], best[1], fallbacks


def get_ranked_contact_methods(human):
    """
    get all contact methods for a human, ranked by their activity.
    """
    method, info, fallbacks = determine_contact_method(human)
    if method == 'manual':
        return []
    return [(method, info)] + fallbacks

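The scoring scheme above boils down to two steps: compute a per-method activity score, then sort descending. A simplified, self-contained sketch of that idea — the formulas mirror the mastodon baseline above, but this is an illustration, not the production `determine_contact_method`:

```python
# simplified sketch of activity-based contact ranking - illustrative only
def mastodon_score(posts: int, is_home_platform: bool = False) -> int:
    # home platform gets a higher base (50 vs 30) and a steeper per-post ramp
    base, step = (50, 5) if is_home_platform else (30, 10)
    return min(100, base + posts // step)

def rank_contacts(methods: dict) -> list:
    # highest activity score first
    return [m for m, _ in sorted(methods.items(), key=lambda kv: kv[1], reverse=True)]

scores = {
    "email": 50,                      # fixed baseline - activity unmeasurable
    "mastodon": mastodon_score(220),  # 30 + 220 // 10 = 52
    "matrix": 40,
    "discord": 20,
}
print(rank_contacts(scores))  # ['mastodon', 'email', 'matrix', 'discord']
```

With 220 posts, mastodon edges out the fixed email baseline, which is exactly the "rank by where they're most active" behavior the comments describe.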
@@ -293,7 +293,28 @@ repos: {len(to_extra.get('top_repos', []))} public repos
languages: {', '.join(to_extra.get('languages', {}).keys())}
"""

other_profile = f"""

# extract other person's best contact method
other_contact = other_person.get('contact', {})
if isinstance(other_contact, str):
    import json as j
    try:
        other_contact = j.loads(other_contact)
    except:
        other_contact = {}

# determine their preferred contact
other_preferred = ''
if other_contact.get('mastodon'):
    other_preferred = f"mastodon: {other_contact['mastodon']}"
elif other_contact.get('github'):
    other_preferred = f"github: github.com/{other_contact['github']}"
elif other_contact.get('email'):
    other_preferred = f"email: {other_contact['email']}"
elif other_person.get('url'):
    other_preferred = f"url: {other_person['url']}"

other_profile = f"""
name: {other_name}
platform: {other_person.get('platform', 'unknown')}
bio: {other_person.get('bio') or 'no bio'}

@@ -302,6 +323,7 @@ signals: {', '.join(other_signals[:8])}
repos: {len(other_extra.get('top_repos', []))} public repos
languages: {', '.join(other_extra.get('languages', {}).keys())}
url: {other_person.get('url', '')}
contact: {other_preferred}
"""

# build prompt

@@ -318,6 +340,7 @@ rules:
- no emojis unless the person's profile suggests they'd like them
- mention specific things from their profiles, not generic "you both like open source"
- end with a simple invitation, not a hard sell
- IMPORTANT: always tell them how to reach the other person (their contact info is provided)
- sign off as "- connectd" (lowercase)

bad examples:

@@ -1,28 +0,0 @@
ARG BUILD_FROM
FROM ${BUILD_FROM}

# install python deps
RUN apk add --no-cache python3 py3-pip py3-requests py3-beautifulsoup4

# create app directory
WORKDIR /app

# copy requirements and install
COPY requirements.txt .
RUN pip3 install --no-cache-dir --break-system-packages -r requirements.txt

# copy app code
COPY api.py config.py daemon.py cli.py setup_user.py ./
COPY db/ db/
COPY scoutd/ scoutd/
COPY matchd/ matchd/
COPY introd/ introd/

# create data directory
RUN mkdir -p /data/db /data/cache

# copy run script
COPY run.sh /
RUN chmod a+x /run.sh

CMD ["/run.sh"]

@@ -1,52 +0,0 @@
# connectd add-on for home assistant

find isolated builders with aligned values. auto-discovers humans on github, mastodon, lemmy, discord, and more.

## installation

1. add this repository to your home assistant add-on store
2. install the connectd add-on
3. configure your HOST_USER (github username) in the add-on settings
4. start the add-on

## configuration

### required
- **host_user**: your github username (connectd will auto-discover your profile)

### optional host info
- **host_name**: your display name
- **host_email**: your email
- **host_mastodon**: mastodon handle (@user@instance)
- **host_reddit**: reddit username
- **host_lemmy**: lemmy handle (@user@instance)
- **host_lobsters**: lobsters username
- **host_matrix**: matrix handle (@user:server)
- **host_discord**: discord user id
- **host_bluesky**: bluesky handle (handle.bsky.social)
- **host_location**: your location
- **host_interests**: comma-separated interests
- **host_looking_for**: what you're looking for

### api credentials
- **github_token**: for higher rate limits
- **groq_api_key**: for LLM-drafted intros
- **mastodon_token**: for DM delivery
- **discord_bot_token**: for discord discovery/delivery

## hacs integration

after starting the add-on, install the connectd integration via HACS:

1. add custom repository: `https://github.com/sudoxnym/connectd`
2. install connectd integration
3. add integration in HA settings
4. configure with host: `localhost`, port: `8099`

## sensors

- total humans, high score humans, active builders
- platform counts (github, mastodon, reddit, lemmy, discord, lobsters)
- priority matches, top humans
- countdown timers (next scout, match, intro)
- your personal score and profile

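A hypothetical add-on options example for the configuration keys listed above (key names are taken from that list; all values are placeholders):

```yaml
# example add-on options - values are placeholders
host_user: "sudoxnym"
host_name: "sudo"
host_mastodon: "@sudo@mastodon.sudoxreboot.com"
host_interests: "home-assistant, self-hosting, open source"
host_looking_for: "collaborators on community tools"
github_token: "ghp_xxxx"
groq_api_key: "gsk_xxxx"
```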
@@ -1,11 +0,0 @@
build_from:
  amd64: ghcr.io/hassio-addons/base:15.0.8
  aarch64: ghcr.io/hassio-addons/base:15.0.8
  armv7: ghcr.io/hassio-addons/base:15.0.8
labels:
  org.opencontainers.image.title: "connectd"
  org.opencontainers.image.description: "find isolated builders with aligned values"
  org.opencontainers.image.source: "https://github.com/sudoxnym/connectd"
  org.opencontainers.image.licenses: "MIT"
args:
  BUILD_ARCH: amd64

@@ -1,878 +0,0 @@
#!/usr/bin/env python3
"""
connectd - people discovery and matchmaking daemon
finds isolated builders and connects them
also finds LOST builders who need encouragement

usage:
    connectd scout                  # run all scrapers
    connectd scout --github         # github only
    connectd scout --reddit         # reddit only
    connectd scout --mastodon       # mastodon only
    connectd scout --lobsters       # lobste.rs only
    connectd scout --matrix         # matrix only
    connectd scout --lost           # show lost builder stats after scout

    connectd match                  # find all matches
    connectd match --top 20         # show top 20 matches
    connectd match --mine           # show YOUR matches (priority user)
    connectd match --lost           # find matches for lost builders

    connectd intro                  # generate intros for top matches
    connectd intro --match 123      # generate intro for specific match
    connectd intro --dry-run        # preview intros without saving
    connectd intro --lost           # generate intros for lost builders

    connectd review                 # interactive review queue
    connectd send                   # send all approved intros
    connectd send --export          # export for manual sending

    connectd daemon                 # run as continuous daemon
    connectd daemon --oneshot       # run once then exit
    connectd daemon --dry-run       # run but never send intros
    connectd daemon --oneshot --dry-run   # one cycle, preview only

    connectd user                   # show your priority user profile
    connectd user --setup           # setup/update your profile
    connectd user --matches         # show matches found for you

    connectd status                 # show database stats (including lost builders)
    connectd lost                   # show lost builders ready for outreach
"""

import argparse
import sys
from pathlib import Path

# add parent to path for imports
sys.path.insert(0, str(Path(__file__).parent))

from db import Database
from db.users import (init_users_table, add_priority_user, get_priority_users,
                      get_priority_user_matches, score_priority_user, auto_match_priority_user,
                      update_priority_user_profile)
from scoutd import scrape_github, scrape_reddit, scrape_mastodon, scrape_lobsters, scrape_matrix
from scoutd.deep import deep_scrape_github_user
from scoutd.lost import get_signal_descriptions
from introd.deliver import (deliver_intro, deliver_batch, get_delivery_stats,
                            review_manual_queue, determine_best_contact, load_manual_queue,
                            save_manual_queue)
from matchd import find_all_matches, generate_fingerprint
from matchd.rank import get_top_matches
from matchd.lost import find_matches_for_lost_builders, get_lost_match_summary
from introd import draft_intro
from introd.draft import draft_intros_for_match
from introd.lost_intro import draft_lost_intro, get_lost_intro_config
from introd.review import review_all_pending, get_pending_intros
from introd.send import send_all_approved, export_manual_intros


def cmd_scout(args, db):
    """run discovery scrapers"""
    from scoutd.deep import deep_scrape_github_user, save_deep_profile

    print("=" * 60)
    print("connectd scout - discovering aligned humans")
    print("=" * 60)

    # deep scrape specific user
    if args.user:
        print(f"\ndeep scraping github user: {args.user}")
        profile = deep_scrape_github_user(args.user)
        if profile:
            save_deep_profile(db, profile)
            print(f"\n=== {profile['username']} ===")
            print(f"real name: {profile.get('real_name')}")
            print(f"location: {profile.get('location')}")
            print(f"company: {profile.get('company')}")
            print(f"email: {profile.get('email')}")
            print(f"twitter: {profile.get('twitter')}")
            print(f"mastodon: {profile.get('mastodon')}")
            print(f"orgs: {', '.join(profile.get('orgs', []))}")
            print(f"languages: {', '.join(list(profile.get('languages', {}).keys())[:5])}")
            print(f"topics: {', '.join(profile.get('topics', [])[:10])}")
            print(f"signals: {', '.join(profile.get('signals', []))}")
            print(f"score: {profile.get('score')}")
            if profile.get('linked_profiles'):
                print(f"linked profiles: {list(profile['linked_profiles'].keys())}")
        else:
            print("failed to scrape user")
        return

    run_all = not any([args.github, args.reddit, args.mastodon, args.lobsters, args.matrix, args.twitter, args.bluesky, args.lemmy, args.discord])

    if args.github or run_all:
        if args.deep:
            # deep scrape mode - slower but more thorough
            print("\nrunning DEEP github scrape (follows all links)...")
            from scoutd.github import get_repo_contributors
            from scoutd.signals import ECOSYSTEM_REPOS

            all_logins = set()
            for repo in ECOSYSTEM_REPOS[:5]:  # limit for deep mode
                contributors = get_repo_contributors(repo, per_page=20)
                for c in contributors:
                    login = c.get('login')
                    if login and not login.endswith('[bot]'):
                        all_logins.add(login)
                print(f"  {repo}: {len(contributors)} contributors")

            print(f"\ndeep scraping {len(all_logins)} users...")
            for login in all_logins:
                try:
                    profile = deep_scrape_github_user(login)
                    if profile and profile.get('score', 0) > 0:
                        save_deep_profile(db, profile)
                        if profile['score'] >= 30:
                            print(f"  ★ {login}: {profile['score']} pts")
                            if profile.get('email'):
                                print(f"    email: {profile['email']}")
                            if profile.get('mastodon'):
                                print(f"    mastodon: {profile['mastodon']}")
                except Exception as e:
                    print(f"  error on {login}: {e}")
        else:
            scrape_github(db)

    if args.reddit or run_all:
        scrape_reddit(db)

    if args.mastodon or run_all:
        scrape_mastodon(db)

    if args.lobsters or run_all:
        scrape_lobsters(db)

    if args.matrix or run_all:
        scrape_matrix(db)

    if args.twitter or run_all:
        from scoutd.twitter import scrape_twitter
        scrape_twitter(db)

    if args.bluesky or run_all:
        from scoutd.bluesky import scrape_bluesky
        scrape_bluesky(db)

    if args.lemmy or run_all:
        from scoutd.lemmy import scrape_lemmy
        scrape_lemmy(db)

    if args.discord or run_all:
        from scoutd.discord import scrape_discord
        scrape_discord(db)

    # show stats
    stats = db.stats()
    print("\n" + "=" * 60)
    print("SCOUT COMPLETE")
    print("=" * 60)
    print(f"total humans: {stats['total_humans']}")
    for platform, count in stats.get('by_platform', {}).items():
        print(f"  {platform}: {count}")

    # show lost builder stats if requested
    if args.lost or True:  # always show lost stats now
        print("\n--- lost builder stats ---")
        print(f"active builders: {stats.get('active_builders', 0)}")
        print(f"lost builders: {stats.get('lost_builders', 0)}")
        print(f"recovering builders: {stats.get('recovering_builders', 0)}")
        print(f"high lost score (40+): {stats.get('high_lost_score', 0)}")
        print(f"lost outreach sent: {stats.get('lost_outreach_sent', 0)}")


def cmd_match(args, db):
    """find and rank matches"""
    import json as json_mod

    print("=" * 60)
    print("connectd match - finding aligned pairs")
    print("=" * 60)

    # lost builder matching
    if args.lost:
        print("\n--- LOST BUILDER MATCHING ---")
        print("finding inspiring builders for lost souls...\n")

        matches, error = find_matches_for_lost_builders(db, limit=args.top or 20)

        if error:
            print(f"error: {error}")
            return

        if not matches:
            print("no lost builders ready for outreach")
            return

        print(f"found {len(matches)} lost builders with matching active builders\n")

        for i, match in enumerate(matches, 1):
            lost = match['lost_user']
            builder = match['inspiring_builder']

            lost_name = lost.get('name') or lost.get('username')
            builder_name = builder.get('name') or builder.get('username')

            print(f"{i}. {lost_name} ({lost.get('platform')}) → needs inspiration from")
            print(f"   {builder_name} ({builder.get('platform')})")
            print(f"   lost score: {lost.get('lost_potential_score', 0)} | values: {lost.get('score', 0)}")
            print(f"   shared interests: {', '.join(match.get('shared_interests', []))}")
            print(f"   builder has: {match.get('builder_repos', 0)} repos, {match.get('builder_stars', 0)} stars")
            print()

        return

    if args.mine:
        # show matches for priority user
        init_users_table(db.conn)
        users = get_priority_users(db.conn)
        if not users:
            print("no priority user configured. run: connectd user --setup")
            return

        for user in users:
            print(f"\n=== matches for {user['name']} ===\n")
            matches = get_priority_user_matches(db.conn, user['id'], limit=args.top or 20)

            if not matches:
                print("no matches yet - run: connectd scout && connectd match")
                continue

            for i, match in enumerate(matches, 1):
                print(f"{i}. {match['username']} ({match['platform']})")
                print(f"   score: {match['overlap_score']:.0f}")
                print(f"   url: {match['url']}")
                reasons = match.get('overlap_reasons', '[]')
                if isinstance(reasons, str):
                    reasons = json_mod.loads(reasons)
                if reasons:
                    print(f"   why: {reasons[0]}")
                print()
        return

    if args.top and not args.mine:
        # just show existing top matches
        matches = get_top_matches(db, limit=args.top)
    else:
        # run full matching
        matches = find_all_matches(db, min_score=args.min_score, min_overlap=args.min_overlap)

    print("\n" + "-" * 60)
    print(f"TOP {min(len(matches), args.top or 20)} MATCHES")
    print("-" * 60)

    for i, match in enumerate(matches[:args.top or 20], 1):
        human_a = match.get('human_a', {})
        human_b = match.get('human_b', {})

        print(f"\n{i}. {human_a.get('username')} <-> {human_b.get('username')}")
        print(f"   platforms: {human_a.get('platform')} / {human_b.get('platform')}")
        print(f"   overlap: {match.get('overlap_score', 0):.0f} pts")

        reasons = match.get('overlap_reasons', [])
        if isinstance(reasons, str):
            reasons = json_mod.loads(reasons)
        if reasons:
            print(f"   why: {' | '.join(reasons[:3])}")

        if match.get('geographic_match'):
            print("   location: compatible ✓")


def cmd_intro(args, db):
    """generate intro drafts"""
    import json as json_mod

    print("=" * 60)
    print("connectd intro - drafting introductions")
    print("=" * 60)

    if args.dry_run:
        print("*** DRY RUN MODE - previewing only ***\n")

    # lost builder intros - different tone entirely
    if args.lost:
        print("\n--- LOST BUILDER INTROS ---")
        print("drafting encouragement for lost souls...\n")

        matches, error = find_matches_for_lost_builders(db, limit=args.limit or 10)

        if error:
            print(f"error: {error}")
            return

        if not matches:
            print("no lost builders ready for outreach")
            return

        config = get_lost_intro_config()
        count = 0

        for match in matches:
            lost = match['lost_user']
            builder = match['inspiring_builder']

            lost_name = lost.get('name') or lost.get('username')
            builder_name = builder.get('name') or builder.get('username')

            # draft intro
            draft, error = draft_lost_intro(lost, builder, config)

            if error:
                print(f"  error drafting intro for {lost_name}: {error}")
                continue

            if args.dry_run:
                print("=" * 60)
                print(f"TO: {lost_name} ({lost.get('platform')})")
                print(f"LOST SCORE: {lost.get('lost_potential_score', 0)}")
                print(f"INSPIRING: {builder_name} ({builder.get('url')})")
                print("-" * 60)
                print("MESSAGE:")
                print(draft)
                print("-" * 60)
                print("[DRY RUN - NOT SAVED]")
                print("=" * 60)
            else:
                print(f"  drafted intro for {lost_name} → {builder_name}")

            count += 1

        if args.dry_run:
            print(f"\npreviewed {count} lost builder intros (dry run)")
        else:
            print(f"\ndrafted {count} lost builder intros")
            print("these require manual review before sending")

        return

    if args.match:
        # specific match
        matches = [m for m in get_top_matches(db, limit=1000) if m.get('id') == args.match]
    else:
        # top matches
        matches = get_top_matches(db, limit=args.limit or 10)

    if not matches:
        print("no matches found")
        return

    print(f"generating intros for {len(matches)} matches...")

    count = 0
    for match in matches:
        intros = draft_intros_for_match(match)

        for intro in intros:
            recipient = intro['recipient_human']
            other = intro['other_human']

            if args.dry_run:
                # get contact info
                contact = recipient.get('contact', {})
                if isinstance(contact, str):
                    contact = json_mod.loads(contact)
                email = contact.get('email', 'no email')

                # get overlap reasons
                reasons = match.get('overlap_reasons', [])
                if isinstance(reasons, str):
                    reasons = json_mod.loads(reasons)
                reason_summary = ', '.join(reasons[:3]) if reasons else 'aligned values'

                # print preview
                print("\n" + "=" * 60)
                print(f"TO: {recipient.get('username')} ({recipient.get('platform')})")
                print(f"EMAIL: {email}")
                print(f"SUBJECT: you might want to meet {other.get('username')}")
                print(f"SCORE: {match.get('overlap_score', 0):.0f} ({reason_summary})")
                print("-" * 60)
                print("MESSAGE:")
                print(intro['draft'])
                print("-" * 60)
                print("[DRY RUN - NOT SENT]")
                print("=" * 60)
            else:
                print(f"\n  {recipient.get('username')} ({intro['channel']})")

                # save to db
                db.save_intro(
                    match.get('id'),
                    recipient.get('id'),
                    intro['channel'],
                    intro['draft']
                )

            count += 1

    if args.dry_run:
        print(f"\npreviewed {count} intros (dry run - nothing saved)")
    else:
        print(f"\ngenerated {count} intro drafts")
        print("run 'connectd review' to approve before sending")


def cmd_review(args, db):
    """interactive review queue"""
    review_all_pending(db)


def cmd_send(args, db):
    """send approved intros"""
    import json as json_mod

    if args.export:
        # export manual queue to file for review
        queue = load_manual_queue()
        pending = [q for q in queue if q.get('status') == 'pending']

        with open(args.export, 'w') as f:
            json_mod.dump(pending, f, indent=2)

        print(f"exported {len(pending)} pending intros to {args.export}")
        return

    # send all approved from manual queue
    queue = load_manual_queue()
    approved = [q for q in queue if q.get('status') == 'approved']

    if not approved:
        print("no approved intros to send")
        print("use 'connectd review' to approve intros first")
        return

    print(f"sending {len(approved)} approved intros...")

    for item in approved:
        match_data = item.get('match', {})
        intro_draft = item.get('draft', '')
        recipient = item.get('recipient', {})

        success, error, method = deliver_intro(
            {'human_b': recipient, **match_data},
            intro_draft,
            dry_run=getattr(args, 'dry_run', False)
        )

        status = 'ok' if success else f'failed: {error}'
        print(f"  {recipient.get('username')}: {method} - {status}")

        # update queue status
        item['status'] = 'sent' if success else 'failed'
        item['error'] = error

    save_manual_queue(queue)

    # show stats
    stats = get_delivery_stats()
    print(f"\ndelivery stats: {stats['sent']} sent, {stats['failed']} failed")


def cmd_lost(args, db):
    """show lost builders ready for outreach"""
    import json as json_mod

    print("=" * 60)
    print("connectd lost - lost builders who need encouragement")
    print("=" * 60)

    # get lost builders
    lost_builders = db.get_lost_builders_for_outreach(
        min_lost_score=args.min_score or 40,
        min_values_score=20,
        limit=args.limit or 50
    )

    if not lost_builders:
        print("\nno lost builders ready for outreach")
        print("run 'connectd scout' to discover more")
        return

    print(f"\n{len(lost_builders)} lost builders ready for outreach:\n")

    for i, lost in enumerate(lost_builders, 1):
        name = lost.get('name') or lost.get('username')
        platform = lost.get('platform')
        lost_score = lost.get('lost_potential_score', 0)
        values_score = lost.get('score', 0)

        # parse lost signals
        lost_signals = lost.get('lost_signals', [])
        if isinstance(lost_signals, str):
            lost_signals = json_mod.loads(lost_signals) if lost_signals else []

        # get signal descriptions
        signal_descriptions = get_signal_descriptions(lost_signals)

        print(f"{i}. {name} ({platform})")
        print(f"   lost score: {lost_score} | values score: {values_score}")
        print(f"   url: {lost.get('url')}")
        if signal_descriptions:
            print(f"   why lost: {', '.join(signal_descriptions[:3])}")
        print()

    if args.verbose:
        print("-" * 60)
        print("these people need encouragement, not networking.")
        print("the goal: show them someone like them made it.")
        print("-" * 60)


def cmd_status(args, db):
    """show database stats"""
    import json as json_mod

    init_users_table(db.conn)
    stats = db.stats()

    print("=" * 60)
    print("connectd status")
    print("=" * 60)

    # priority users
    users = get_priority_users(db.conn)
    print(f"\npriority users: {len(users)}")
    for user in users:
        print(f"  - {user['name']} ({user['email']})")

    print(f"\nhumans discovered: {stats['total_humans']}")
    print(f"  high-score (50+): {stats['high_score_humans']}")

    print("\nby platform:")
    for platform, count in stats.get('by_platform', {}).items():
        print(f"  {platform}: {count}")

    print(f"\nstranger matches: {stats['total_matches']}")
    print(f"intros created: {stats['total_intros']}")
    print(f"intros sent: {stats['sent_intros']}")

    # lost builder stats
    print("\n--- lost builder stats ---")
    print(f"active builders: {stats.get('active_builders', 0)}")
    print(f"lost builders: {stats.get('lost_builders', 0)}")
    print(f"recovering builders: {stats.get('recovering_builders', 0)}")
    print(f"high lost score (40+): {stats.get('high_lost_score', 0)}")
    print(f"lost outreach sent: {stats.get('lost_outreach_sent', 0)}")

    # priority user matches
    for user in users:
        matches = get_priority_user_matches(db.conn, user['id'])
        print(f"\nmatches for {user['name']}: {len(matches)}")

    # pending intros
    pending = get_pending_intros(db)
    print(f"\nintros pending review: {len(pending)}")


def cmd_daemon(args, db):
    """run as continuous daemon"""
    from daemon import ConnectDaemon

    daemon = ConnectDaemon(dry_run=args.dry_run)

    if args.oneshot:
        print("running one cycle...")
        if args.dry_run:
            print("*** DRY RUN MODE - no intros will be sent ***")
        daemon.scout_cycle()
        daemon.match_priority_users()
        daemon.match_strangers()
        daemon.send_stranger_intros()
        print("done")
    else:
        daemon.run()


def cmd_user(args, db):
    """manage priority user profile"""
    import json as json_mod

    init_users_table(db.conn)

    if args.setup:
        # interactive setup
        print("=" * 60)
        print("connectd priority user setup")
        print("=" * 60)
        print("\nlink your profiles so connectd finds matches for YOU\n")

        name = input("name: ").strip()
        email = input("email: ").strip()
        github = input("github username: ").strip() or None
        reddit = input("reddit username: ").strip() or None
        mastodon = input("mastodon (user@instance): ").strip() or None
        location = input("location (e.g. seattle): ").strip() or None

        print("\ninterests (comma separated):")
        interests_raw = input("> ").strip()
        interests = [i.strip() for i in interests_raw.split(',')] if interests_raw else []

        looking_for = input("looking for: ").strip() or None

        user_data = {
            'name': name, 'email': email, 'github': github,
            'reddit': reddit, 'mastodon': mastodon,
            'location': location, 'interests': interests,
            'looking_for': looking_for,
        }
        user_id = add_priority_user(db.conn, user_data)
        print(f"\n✓ added as priority user #{user_id}")

    elif args.matches:
        # show matches
        users = get_priority_users(db.conn)
        if not users:
            print("no priority user. run: connectd user --setup")
            return

        for user in users:
            print(f"\n=== matches for {user['name']} ===\n")
            matches = get_priority_user_matches(db.conn, user['id'], limit=20)

            if not matches:
                print("no matches yet")
                continue

            for i, match in enumerate(matches, 1):
                print(f"{i}. {match['username']} ({match['platform']})")
                print(f"   {match['url']}")
                print(f"   score: {match['overlap_score']:.0f}")
                print()

    else:
        # show profile
        users = get_priority_users(db.conn)
        if not users:
            print("no priority user configured")
            print("run: connectd user --setup")
            return

        for user in users:
            print("=" * 60)
            print(f"priority user #{user['id']}: {user['name']}")
            print("=" * 60)
            print(f"email: {user['email']}")
            if user['github']:
                print(f"github: {user['github']}")
            if user['reddit']:
                print(f"reddit: {user['reddit']}")
            if user['mastodon']:
                print(f"mastodon: {user['mastodon']}")
            if user['location']:
                print(f"location: {user['location']}")
            if user['interests']:
                interests = json_mod.loads(user['interests']) if isinstance(user['interests'], str) else user['interests']
                print(f"interests: {', '.join(interests)}")
            if user['looking_for']:
                print(f"looking for: {user['looking_for']}")


def cmd_me(args, db):
    """auto-score and auto-match for priority user with optional groq intros"""
    import json as json_mod

    init_users_table(db.conn)

    # get priority user
    users = get_priority_users(db.conn)
    if not users:
        print("no priority user configured")
        print("run: connectd user --setup")
        return

    user = users[0]  # first/main user
    print("=" * 60)
    print(f"connectd me - {user['name']}")
    print("=" * 60)

    # step 1: scrape github profile
    if user.get('github') and not args.skip_scrape:
        print(f"\n[1/4] scraping github profile: {user['github']}")
        profile = deep_scrape_github_user(user['github'], scrape_commits=False)
        if profile:
            print(f"  repos: {len(profile.get('top_repos', []))}")
            print(f"  languages: {', '.join(list(profile.get('languages', {}).keys())[:5])}")
        else:
            print("  failed to scrape (rate limited?)")
            profile = None
    else:
        print("\n[1/4] skipping github scrape (using saved profile)")
        # use saved profile if available
        saved = user.get('scraped_profile')
        if saved:
            profile = json_mod.loads(saved) if isinstance(saved, str) else saved
            print(f"  loaded saved profile: {len(profile.get('top_repos', []))} repos")
        else:
            profile = None

    # step 2: calculate score
    print("\n[2/4] calculating your score...")
    result = score_priority_user(db.conn, user['id'], profile)
    if result:
        print(f"  score: {result['score']}")
        print(f"  signals: {', '.join(sorted(result['signals'])[:10])}")

    # step 3: find matches
    print("\n[3/4] finding matches...")
    matches = auto_match_priority_user(db.conn, user['id'], min_overlap=args.min_overlap)
    print(f"  found {len(matches)} matches")

    # step 4: show results (optionally with groq intros)
    print("\n[4/4] top matches:")
    print("-" * 60)

    limit = args.limit or 10
    for i, m in enumerate(matches[:limit], 1):
        human = m['human']
        shared = m['shared']

        print(f"\n{i}. {human.get('name') or human['username']} ({human['platform']})")
        print(f"   {human.get('url', '')}")
        print(f"   score: {human.get('score', 0):.0f} | overlap: {m['overlap_score']:.0f}")
        print(f"   location: {human.get('location') or 'unknown'}")
        print(f"   why: {', '.join(shared[:5])}")

        # groq intro draft
        if args.groq:
            try:
                from introd.groq_draft import draft_intro_with_llm
                match_data = {
                    'human_a': {'name': user['name'], 'username': user.get('github'),
                                'platform': 'github', 'signals': result.get('signals', []) if result else [],
                                'bio': user.get('bio'), 'location': user.get('location'),
                                'extra': profile or {}},
                    'human_b': human,
                    'overlap_score': m['overlap_score'],
                    'overlap_reasons': shared,
                }
                intro, err = draft_intro_with_llm(match_data, recipient='b')
                if intro:
                    print(f"\n   --- groq draft ({intro.get('contact_method', 'manual')}) ---")
                    if intro.get('contact_info'):
                        print(f"   deliver via: {intro['contact_info']}")
                    for line in intro['draft'].split('\n'):
                        print(f"   {line}")
                    print("   ------------------")
                elif err:
                    print(f"   [groq error: {err}]")
            except Exception as e:
                print(f"   [groq error: {e}]")

    # summary
    print("\n" + "=" * 60)
    print(f"your score: {result['score'] if result else 'unknown'}")
    print(f"matches found: {len(matches)}")
    if args.groq:
        print("groq intros: enabled")
    else:
        print("tip: add --groq to generate ai intro drafts")


def main():
    parser = argparse.ArgumentParser(
        description='connectd - people discovery and matchmaking daemon',
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog=__doc__
    )

    subparsers = parser.add_subparsers(dest='command', help='commands')

    # scout command
    scout_parser = subparsers.add_parser('scout', help='discover aligned humans')
    scout_parser.add_argument('--github', action='store_true', help='github only')
    scout_parser.add_argument('--reddit', action='store_true', help='reddit only')
    scout_parser.add_argument('--mastodon', action='store_true', help='mastodon only')
    scout_parser.add_argument('--lobsters', action='store_true', help='lobste.rs only')
    scout_parser.add_argument('--matrix', action='store_true', help='matrix only')
    scout_parser.add_argument('--twitter', action='store_true', help='twitter/x via nitter')
    scout_parser.add_argument('--bluesky', action='store_true', help='bluesky/atproto')
    scout_parser.add_argument('--lemmy', action='store_true', help='lemmy (fediverse reddit)')
    scout_parser.add_argument('--discord', action='store_true', help='discord servers')
    scout_parser.add_argument('--deep', action='store_true', help='deep scrape - follow all links')
    scout_parser.add_argument('--user', type=str, help='deep scrape specific github user')
    scout_parser.add_argument('--lost', action='store_true', help='show lost builder stats')

    # match command
    match_parser = subparsers.add_parser('match', help='find and rank matches')
    match_parser.add_argument('--top', type=int, help='show top N matches')
    match_parser.add_argument('--mine', action='store_true', help='show YOUR matches')
    match_parser.add_argument('--lost', action='store_true', help='find matches for lost builders')
    match_parser.add_argument('--min-score', type=int, default=30, help='min human score')
    match_parser.add_argument('--min-overlap', type=int, default=20, help='min overlap score')

    # intro command
    intro_parser = subparsers.add_parser('intro', help='generate intro drafts')
    intro_parser.add_argument('--match', type=int, help='specific match id')
    intro_parser.add_argument('--limit', type=int, default=10, help='number of matches')
    intro_parser.add_argument('--dry-run', action='store_true', help='preview only, do not save')
    intro_parser.add_argument('--lost', action='store_true', help='generate intros for lost builders')

    # lost command - show lost builders ready for outreach
    lost_parser = subparsers.add_parser('lost', help='show lost builders who need encouragement')
    lost_parser.add_argument('--min-score', type=int, default=40, help='min lost score')
    lost_parser.add_argument('--limit', type=int, default=50, help='max results')
    lost_parser.add_argument('--verbose', '-v', action='store_true', help='show philosophy')

    # review command
    review_parser = subparsers.add_parser('review', help='review intro queue')

    # send command
    send_parser = subparsers.add_parser('send', help='send approved intros')
    send_parser.add_argument('--export', type=str, help='export to file for manual sending')

    # status command
    status_parser = subparsers.add_parser('status', help='show stats')

    # daemon command
    daemon_parser = subparsers.add_parser('daemon', help='run as continuous daemon')
    daemon_parser.add_argument('--oneshot', action='store_true', help='run once then exit')
    daemon_parser.add_argument('--dry-run', action='store_true', help='preview intros, do not send')

    # user command
    user_parser = subparsers.add_parser('user', help='manage priority user profile')
    user_parser.add_argument('--setup', action='store_true', help='setup/update profile')
    user_parser.add_argument('--matches', action='store_true', help='show your matches')

    # me command - auto score + match + optional groq intros
    me_parser = subparsers.add_parser('me', help='auto-score and match yourself')
    me_parser.add_argument('--groq', action='store_true', help='generate groq llama intro drafts')
    me_parser.add_argument('--skip-scrape', action='store_true', help='skip github scraping')
    me_parser.add_argument('--min-overlap', type=int, default=40, help='min overlap score')
    me_parser.add_argument('--limit', type=int, default=10, help='number of matches to show')

    args = parser.parse_args()

    if not args.command:
        parser.print_help()
        return

    # init database
    db = Database()

    try:
        if args.command == 'scout':
            cmd_scout(args, db)
        elif args.command == 'match':
            cmd_match(args, db)
        elif args.command == 'intro':
            cmd_intro(args, db)
        elif args.command == 'review':
            cmd_review(args, db)
        elif args.command == 'send':
            cmd_send(args, db)
        elif args.command == 'status':
            cmd_status(args, db)
        elif args.command == 'daemon':
            cmd_daemon(args, db)
        elif args.command == 'user':
            cmd_user(args, db)
        elif args.command == 'me':
            cmd_me(args, db)
        elif args.command == 'lost':
            cmd_lost(args, db)
    finally:
        db.close()


if __name__ == '__main__':
    main()
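The long `if/elif` chain in `main()` can equivalently be written as a table-driven dispatch, which makes adding a subcommand a one-line change. A minimal standalone sketch (the `cmd_*` handlers here are stand-ins, not connectd's real functions):

```python
# table-driven command dispatch: command name -> handler(args, db)
def cmd_scout(args, db):
    # stand-in handler; the real one would run discovery
    return f"scout:{db}"

def cmd_status(args, db):
    # stand-in handler; the real one would print db stats
    return f"status:{db}"

COMMANDS = {
    'scout': cmd_scout,
    'status': cmd_status,
}

def dispatch(command, args, db):
    """look up the handler for a command and invoke it"""
    handler = COMMANDS.get(command)
    if handler is None:
        raise SystemExit(f"unknown command: {command}")
    return handler(args, db)

print(dispatch('scout', None, 'db0'))  # scout:db0
```

Since argparse subparsers already restrict `args.command` to registered names, the unknown-command branch is only a safety net; argparse's `set_defaults(func=...)` idiom achieves the same thing without a separate table.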
@@ -1,124 +0,0 @@
"""
connectd/config.py - central configuration

all configurable settings in one place.
"""

import os
from pathlib import Path

# base paths
BASE_DIR = Path(__file__).parent
DB_DIR = BASE_DIR / 'db'
DATA_DIR = BASE_DIR / 'data'
CACHE_DIR = DB_DIR / 'cache'

# ensure directories exist
DATA_DIR.mkdir(exist_ok=True)
CACHE_DIR.mkdir(exist_ok=True)


# === DAEMON CONFIG ===
SCOUT_INTERVAL = 3600 * 4   # full scout every 4 hours
MATCH_INTERVAL = 3600       # check matches every hour
INTRO_INTERVAL = 3600 * 2   # send intros every 2 hours
MAX_INTROS_PER_DAY = 20     # rate limit builder-to-builder outreach


# === MATCHING CONFIG ===
MIN_OVERLAP_PRIORITY = 30   # min score for priority user matches
MIN_OVERLAP_STRANGERS = 50  # higher bar for stranger intros
MIN_HUMAN_SCORE = 25        # min values score to be considered


# === LOST BUILDER CONFIG ===
# these people need encouragement, not networking.
# the goal isn't to recruit them - it's to show them the door exists.

LOST_CONFIG = {
    # detection thresholds
    'min_lost_score': 40,     # minimum lost_potential_score
    'min_values_score': 20,   # must have SOME values alignment

    # outreach settings
    'enabled': True,
    'max_per_day': 5,         # lower volume, higher care
    'require_review': False,  # fully autonomous
    'cooldown_days': 90,      # don't spam struggling people

    # matching settings
    'min_builder_score': 50,  # inspiring builders must be active
    'min_match_overlap': 10,  # must have SOME shared interests

    # LLM drafting
    'use_llm': True,
    'llm_temperature': 0.7,   # be genuine, not robotic

    # message guidelines (for LLM prompt)
    'tone': 'genuine, not salesy',
    'max_words': 150,         # they don't have energy for long messages
    'no_pressure': True,      # never pushy
    'sign_off': '- connectd',
}


# === API CREDENTIALS ===
# all credentials from environment variables - no defaults

GROQ_API_KEY = os.environ.get('GROQ_API_KEY', '')
GROQ_API_URL = 'https://api.groq.com/openai/v1/chat/completions'
GROQ_MODEL = os.environ.get('GROQ_MODEL', 'llama-3.1-70b-versatile')

GITHUB_TOKEN = os.environ.get('GITHUB_TOKEN', '')
MASTODON_TOKEN = os.environ.get('MASTODON_TOKEN', '')
MASTODON_INSTANCE = os.environ.get('MASTODON_INSTANCE', '')

BLUESKY_HANDLE = os.environ.get('BLUESKY_HANDLE', '')
BLUESKY_APP_PASSWORD = os.environ.get('BLUESKY_APP_PASSWORD', '')

MATRIX_HOMESERVER = os.environ.get('MATRIX_HOMESERVER', '')
MATRIX_USER_ID = os.environ.get('MATRIX_USER_ID', '')
MATRIX_ACCESS_TOKEN = os.environ.get('MATRIX_ACCESS_TOKEN', '')

DISCORD_BOT_TOKEN = os.environ.get('DISCORD_BOT_TOKEN', '')
DISCORD_TARGET_SERVERS = os.environ.get('DISCORD_TARGET_SERVERS', '')

# lemmy (for authenticated access to private instance)
LEMMY_INSTANCE = os.environ.get('LEMMY_INSTANCE', '')
LEMMY_USERNAME = os.environ.get('LEMMY_USERNAME', '')
LEMMY_PASSWORD = os.environ.get('LEMMY_PASSWORD', '')

# email (for sending intros)
SMTP_HOST = os.environ.get('SMTP_HOST', '')
SMTP_PORT = int(os.environ.get('SMTP_PORT', '465'))
SMTP_USER = os.environ.get('SMTP_USER', '')
SMTP_PASS = os.environ.get('SMTP_PASS', '')

# === HOST USER CONFIG ===
# the person running connectd - gets priority matching
HOST_USER = os.environ.get('HOST_USER', '')          # alias like sudoxnym
HOST_NAME = os.environ.get('HOST_NAME', '')
HOST_EMAIL = os.environ.get('HOST_EMAIL', '')
HOST_GITHUB = os.environ.get('HOST_GITHUB', '')
HOST_MASTODON = os.environ.get('HOST_MASTODON', '')  # user@instance
HOST_REDDIT = os.environ.get('HOST_REDDIT', '')
HOST_LEMMY = os.environ.get('HOST_LEMMY', '')        # user@instance
HOST_LOBSTERS = os.environ.get('HOST_LOBSTERS', '')
HOST_MATRIX = os.environ.get('HOST_MATRIX', '')      # @user:server
HOST_DISCORD = os.environ.get('HOST_DISCORD', '')    # user id
HOST_BLUESKY = os.environ.get('HOST_BLUESKY', '')    # handle.bsky.social
HOST_LOCATION = os.environ.get('HOST_LOCATION', '')
HOST_INTERESTS = os.environ.get('HOST_INTERESTS', '')  # comma separated
HOST_LOOKING_FOR = os.environ.get('HOST_LOOKING_FOR', '')


def get_lost_config():
    """get lost builder configuration"""
    return LOST_CONFIG.copy()


def update_lost_config(updates):
    """update lost builder configuration"""
    global LOST_CONFIG
    LOST_CONFIG.update(updates)
    return LOST_CONFIG.copy()
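The `get_lost_config`/`update_lost_config` pair in config.py is a copy-on-read pattern: callers always receive a snapshot, so mutating it cannot corrupt module state, and all writes funnel through one function. A minimal standalone sketch (trimmed to two keys for illustration):

```python
# standalone copy of the pattern - callers get copies, writes go through one door
LOST_CONFIG = {
    'min_lost_score': 40,
    'max_per_day': 5,
}

def get_lost_config():
    """return a defensive copy of the current config"""
    return LOST_CONFIG.copy()

def update_lost_config(updates):
    """merge updates into the config and return the new snapshot"""
    LOST_CONFIG.update(updates)
    return LOST_CONFIG.copy()

cfg = get_lost_config()
cfg['max_per_day'] = 1                    # mutating the copy...
assert LOST_CONFIG['max_per_day'] == 5    # ...leaves module state untouched

update_lost_config({'max_per_day': 3})
assert get_lost_config()['max_per_day'] == 3
```

Because `.copy()` is shallow, this protection only holds for top-level keys; nested dicts would need `copy.deepcopy` - LOST_CONFIG's values are all scalars, so the shallow copy suffices here.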
@@ -1,72 +0,0 @@
name: connectd
version: "1.1.0"
slug: connectd
description: "find isolated builders with aligned values. auto-discover humans on github, mastodon, lemmy, discord, and more."
url: "https://github.com/sudoxnym/connectd"
arch:
  - amd64
  - aarch64
  - armv7
startup: application
boot: auto
ports:
  8099/tcp: 8099
ports_description:
  8099/tcp: "connectd API (for HACS integration)"
map:
  - config:rw
options:
  host_user: ""
  host_name: ""
  host_email: ""
  host_mastodon: ""
  host_reddit: ""
  host_lemmy: ""
  host_lobsters: ""
  host_matrix: ""
  host_discord: ""
  host_bluesky: ""
  host_location: ""
  host_interests: ""
  host_looking_for: ""
  github_token: ""
  groq_api_key: ""
  mastodon_token: ""
  mastodon_instance: ""
  discord_bot_token: ""
  discord_target_servers: ""
  lemmy_instance: ""
  lemmy_username: ""
  lemmy_password: ""
  smtp_host: ""
  smtp_port: 465
  smtp_user: ""
  smtp_pass: ""
schema:
  host_user: str?
  host_name: str?
  host_email: email?
  host_mastodon: str?
  host_reddit: str?
  host_lemmy: str?
  host_lobsters: str?
  host_matrix: str?
  host_discord: str?
  host_bluesky: str?
  host_location: str?
  host_interests: str?
  host_looking_for: str?
  github_token: str?
  groq_api_key: str?
  mastodon_token: str?
  mastodon_instance: str?
  discord_bot_token: str?
  discord_target_servers: str?
  lemmy_instance: str?
  lemmy_username: str?
  lemmy_password: str?
  smtp_host: str?
  smtp_port: int?
  smtp_user: str?
  smtp_pass: str?
image: sudoxreboot/connectd-addon-{arch}
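At runtime, Home Assistant serializes an add-on's `options:` block to `/data/options.json` inside the container. How connectd's start script maps those options onto the environment variables config.py reads is not shown in this diff, so the helper below is an illustrative sketch of that conventional mapping, not the repo's actual code:

```python
# sketch (assumption): flatten add-on options into env-var-style UPPERCASE keys,
# skipping unset ("") options so existing env vars and defaults still apply.
def options_to_env(options, env):
    """copy non-empty add-on options into an env mapping as UPPERCASE keys"""
    for key, value in options.items():
        if value in ('', None):
            continue  # unset option - leave any existing env value alone
        env.setdefault(key.upper(), str(value))
    return env

# example:
#   options_to_env({'github_token': 'ghp_x', 'smtp_port': 465, 'smtp_user': ''}, {})
#   -> {'GITHUB_TOKEN': 'ghp_x', 'SMTP_PORT': '465'}
```

In a real start script the first argument would come from `json.load(open('/data/options.json'))` and the second would be `os.environ`.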
@@ -1,546 +0,0 @@
#!/usr/bin/env python3
|
||||
"""
|
||||
connectd daemon - continuous discovery and matchmaking
|
||||
|
||||
two modes of operation:
|
||||
1. priority matching: find matches FOR hosts who run connectd
|
||||
2. altruistic matching: connect strangers to each other
|
||||
|
||||
runs continuously, respects rate limits, sends intros automatically
|
||||
"""
|
||||
|
||||
import time
|
||||
import json
|
||||
import signal
|
||||
import sys
|
||||
from datetime import datetime, timedelta
|
||||
from pathlib import Path
|
||||
|
||||
from db import Database
|
||||
from db.users import (init_users_table, get_priority_users, save_priority_match,
|
||||
get_priority_user_matches, discover_host_user)
|
||||
from scoutd import scrape_github, scrape_reddit, scrape_mastodon, scrape_lobsters, scrape_lemmy, scrape_discord
|
||||
from config import HOST_USER
|
||||
from scoutd.github import analyze_github_user, get_github_user
|
||||
from scoutd.signals import analyze_text
|
||||
from matchd.fingerprint import generate_fingerprint, fingerprint_similarity
|
||||
from matchd.overlap import find_overlap
|
||||
from matchd.lost import find_matches_for_lost_builders
|
||||
from introd.draft import draft_intro, summarize_human, summarize_overlap
|
||||
from introd.lost_intro import draft_lost_intro, get_lost_intro_config
|
||||
from introd.send import send_email
|
||||
from introd.deliver import deliver_intro, determine_best_contact
|
||||
from config import get_lost_config
|
||||
from api import start_api_thread, update_daemon_state
|
||||
|
||||
# daemon config
|
||||
SCOUT_INTERVAL = 3600 * 4 # full scout every 4 hours
|
||||
MATCH_INTERVAL = 3600 # check matches every hour
|
||||
INTRO_INTERVAL = 3600 * 2 # send intros every 2 hours
|
||||
LOST_INTERVAL = 3600 * 6 # lost builder outreach every 6 hours (lower volume)
|
||||
MAX_INTROS_PER_DAY = 20 # rate limit outreach
|
||||
MIN_OVERLAP_PRIORITY = 30 # min score for priority user matches
|
||||
MIN_OVERLAP_STRANGERS = 50 # higher bar for stranger intros
|
||||
|
||||
|
||||
class ConnectDaemon:
|
||||
def __init__(self, dry_run=False):
|
||||
self.db = Database()
|
||||
init_users_table(self.db.conn)
|
||||
self.running = True
|
||||
        self.dry_run = dry_run
        self.started_at = datetime.now()
        self.last_scout = None
        self.last_match = None
        self.last_intro = None
        self.last_lost = None
        self.intros_today = 0
        self.lost_intros_today = 0
        self.today = datetime.now().date()

        # handle shutdown gracefully
        signal.signal(signal.SIGINT, self._shutdown)
        signal.signal(signal.SIGTERM, self._shutdown)

        # auto-discover host user from env
        if HOST_USER:
            self.log(f"HOST_USER set: {HOST_USER}")
            discover_host_user(self.db.conn, HOST_USER)

        # update API state
        self._update_api_state()

    def _shutdown(self, signum, frame):
        print("\nconnectd: shutting down...")
        self.running = False
        self._update_api_state()

    def _update_api_state(self):
        """update API state for HA integration"""
        now = datetime.now()

        # calculate countdowns - if no cycle has run, use started_at
        def secs_until(last, interval):
            base = last if last else self.started_at
            next_run = base + timedelta(seconds=interval)
            remaining = (next_run - now).total_seconds()
            return max(0, int(remaining))

        update_daemon_state({
            'running': self.running,
            'dry_run': self.dry_run,
            'last_scout': self.last_scout.isoformat() if self.last_scout else None,
            'last_match': self.last_match.isoformat() if self.last_match else None,
            'last_intro': self.last_intro.isoformat() if self.last_intro else None,
            'last_lost': self.last_lost.isoformat() if self.last_lost else None,
            'intros_today': self.intros_today,
            'lost_intros_today': self.lost_intros_today,
            'started_at': self.started_at.isoformat(),
            'countdown_scout': secs_until(self.last_scout, SCOUT_INTERVAL),
            'countdown_match': secs_until(self.last_match, MATCH_INTERVAL),
            'countdown_intro': secs_until(self.last_intro, INTRO_INTERVAL),
            'countdown_lost': secs_until(self.last_lost, LOST_INTERVAL),
        })

    def log(self, msg):
        """timestamped log"""
        print(f"[{datetime.now().strftime('%Y-%m-%d %H:%M:%S')}] {msg}")
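The countdown helper can be exercised on its own. A minimal sketch, with `secs_until` pulled out as a free function and an arbitrary interval value standing in for the daemon's config:

```python
from datetime import datetime, timedelta

SCOUT_INTERVAL = 3600  # hypothetical interval in seconds, not the daemon's real config

def secs_until(last, interval, started_at, now):
    """seconds until the next cycle; counts from started_at if the cycle never ran"""
    base = last if last else started_at
    next_run = base + timedelta(seconds=interval)
    return max(0, int((next_run - now).total_seconds()))

started = datetime(2024, 1, 1, 12, 0, 0)
now = datetime(2024, 1, 1, 12, 10, 0)

# never run: one hour counted from startup, minus the 10 minutes already elapsed
print(secs_until(None, SCOUT_INTERVAL, started, now))  # 3000
# ran 30 minutes ago: 30 minutes remain
print(secs_until(now - timedelta(minutes=30), SCOUT_INTERVAL, started, now))  # 1800
```

The `max(0, …)` clamp means an overdue cycle reports a countdown of zero rather than a negative number, which keeps the HA sensors sane.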

    def reset_daily_limits(self):
        """reset daily intro count"""
        if datetime.now().date() != self.today:
            self.today = datetime.now().date()
            self.intros_today = 0
            self.lost_intros_today = 0
            self.log("reset daily intro limits")

    def scout_cycle(self):
        """run discovery on all platforms"""
        self.log("starting scout cycle...")

        try:
            scrape_github(self.db, limit_per_source=30)
        except Exception as e:
            self.log(f"github scout error: {e}")

        try:
            scrape_reddit(self.db, limit_per_sub=30)
        except Exception as e:
            self.log(f"reddit scout error: {e}")

        try:
            scrape_mastodon(self.db, limit_per_instance=30)
        except Exception as e:
            self.log(f"mastodon scout error: {e}")

        try:
            scrape_lobsters(self.db)
        except Exception as e:
            self.log(f"lobsters scout error: {e}")

        try:
            scrape_lemmy(self.db, limit_per_community=30)
        except Exception as e:
            self.log(f"lemmy scout error: {e}")

        try:
            scrape_discord(self.db, limit_per_channel=50)
        except Exception as e:
            self.log(f"discord scout error: {e}")

        self.last_scout = datetime.now()
        stats = self.db.stats()
        self.log(f"scout complete: {stats['total_humans']} humans in db")

    def match_priority_users(self):
        """find matches for priority users (hosts)"""
        priority_users = get_priority_users(self.db.conn)

        if not priority_users:
            return

        self.log(f"matching for {len(priority_users)} priority users...")

        humans = self.db.get_all_humans(min_score=20, limit=500)

        for puser in priority_users:
            # build priority user's fingerprint from their linked profiles
            puser_signals = []
            puser_text = []

            if puser.get('bio'):
                puser_text.append(puser['bio'])
            if puser.get('interests'):
                interests = json.loads(puser['interests']) if isinstance(puser['interests'], str) else puser['interests']
                puser_signals.extend(interests)
            if puser.get('looking_for'):
                puser_text.append(puser['looking_for'])

            # analyze their linked github if available
            if puser.get('github'):
                gh_user = analyze_github_user(puser['github'])
                if gh_user:
                    puser_signals.extend(gh_user.get('signals', []))

            puser_fingerprint = {
                'values_vector': {},
                'skills': {},
                'interests': list(set(puser_signals)),
                'location_pref': 'pnw' if puser.get('location') and 'seattle' in puser['location'].lower() else None,
            }

            # score text
            if puser_text:
                _, text_signals, _ = analyze_text(' '.join(puser_text))
                puser_signals.extend(text_signals)

            # find matches
            matches_found = 0
            for human in humans:
                # skip if it's their own profile on another platform
                human_user = human.get('username', '').lower()
                if puser.get('github') and human_user == puser['github'].lower():
                    continue
                if puser.get('reddit') and human_user == puser['reddit'].lower():
                    continue
                if puser.get('mastodon') and human_user == puser['mastodon'].lower().split('@')[0]:
                    continue

                # calculate overlap
                human_signals = human.get('signals', [])
                if isinstance(human_signals, str):
                    human_signals = json.loads(human_signals)

                shared = set(puser_signals) & set(human_signals)
                overlap_score = len(shared) * 10

                # location bonus
                if puser.get('location') and human.get('location'):
                    if 'seattle' in human['location'].lower() or 'pnw' in human['location'].lower():
                        overlap_score += 20

                if overlap_score >= MIN_OVERLAP_PRIORITY:
                    overlap_data = {
                        'overlap_score': overlap_score,
                        'overlap_reasons': [f"shared: {', '.join(list(shared)[:5])}"] if shared else [],
                    }
                    save_priority_match(self.db.conn, puser['id'], human['id'], overlap_data)
                    matches_found += 1

            if matches_found:
                self.log(f"  found {matches_found} matches for {puser['name'] or puser['email']}")
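The heart of the matcher is set intersection over signal tags: ten points per shared signal plus a flat bonus for a shared region. A self-contained sketch of that scoring (the threshold and bonus values mirror the code above but are illustrative, not the daemon's `MIN_OVERLAP_PRIORITY` constant):

```python
import json

MIN_OVERLAP = 40  # illustrative threshold standing in for MIN_OVERLAP_PRIORITY

def overlap_score(user_signals, human_signals, same_region=False):
    """10 points per shared signal, +20 if both parties are in the same region"""
    if isinstance(human_signals, str):  # signals may arrive as a JSON string from sqlite
        human_signals = json.loads(human_signals)
    shared = set(user_signals) & set(human_signals)
    score = len(shared) * 10
    if same_region:
        score += 20
    return score, shared

score, shared = overlap_score(
    ['selfhosted', 'foss', 'solarpunk', 'privacy'],
    '["foss", "privacy", "mesh"]',
    same_region=True,
)
print(score, sorted(shared))  # 40 ['foss', 'privacy']
```

Two shared signals plus the location bonus is exactly enough to clear a threshold of 40, which is why a single shared interest alone rarely produces an intro.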

    def match_strangers(self):
        """find matches between discovered humans (altruistic)"""
        self.log("matching strangers...")

        humans = self.db.get_all_humans(min_score=40, limit=200)

        if len(humans) < 2:
            return

        # generate fingerprints
        fingerprints = {}
        for human in humans:
            fp = generate_fingerprint(human)
            fingerprints[human['id']] = fp

        # find pairs
        matches_found = 0
        from itertools import combinations

        for human_a, human_b in combinations(humans, 2):
            # skip same platform same user
            if human_a['platform'] == human_b['platform']:
                if human_a['username'] == human_b['username']:
                    continue

            fp_a = fingerprints.get(human_a['id'])
            fp_b = fingerprints.get(human_b['id'])

            overlap = find_overlap(human_a, human_b, fp_a, fp_b)

            if overlap['overlap_score'] >= MIN_OVERLAP_STRANGERS:
                # save match
                self.db.save_match(human_a['id'], human_b['id'], overlap)
                matches_found += 1

        if matches_found:
            self.log(f"found {matches_found} stranger matches")

        self.last_match = datetime.now()

    def send_stranger_intros(self):
        """send intros to connect strangers (or preview in dry-run mode)"""
        self.reset_daily_limits()

        if not self.dry_run and self.intros_today >= MAX_INTROS_PER_DAY:
            self.log("daily intro limit reached")
            return

        # get unsent matches
        c = self.db.conn.cursor()
        c.execute('''SELECT m.*,
                            ha.id as a_id, ha.username as a_user, ha.platform as a_platform,
                            ha.name as a_name, ha.url as a_url, ha.contact as a_contact,
                            ha.signals as a_signals, ha.extra as a_extra,
                            hb.id as b_id, hb.username as b_user, hb.platform as b_platform,
                            hb.name as b_name, hb.url as b_url, hb.contact as b_contact,
                            hb.signals as b_signals, hb.extra as b_extra
                     FROM matches m
                     JOIN humans ha ON m.human_a_id = ha.id
                     JOIN humans hb ON m.human_b_id = hb.id
                     WHERE m.status = 'pending'
                     ORDER BY m.overlap_score DESC
                     LIMIT 10''')

        matches = c.fetchall()

        if self.dry_run:
            self.log(f"DRY RUN: previewing {len(matches)} potential intros")

        for match in matches:
            if not self.dry_run and self.intros_today >= MAX_INTROS_PER_DAY:
                break

            match = dict(match)

            # build human dicts
            human_a = {
                'id': match['a_id'],
                'username': match['a_user'],
                'platform': match['a_platform'],
                'name': match['a_name'],
                'url': match['a_url'],
                'contact': match['a_contact'],
                'signals': match['a_signals'],
                'extra': match['a_extra'],
            }
            human_b = {
                'id': match['b_id'],
                'username': match['b_user'],
                'platform': match['b_platform'],
                'name': match['b_name'],
                'url': match['b_url'],
                'contact': match['b_contact'],
                'signals': match['b_signals'],
                'extra': match['b_extra'],
            }

            match_data = {
                'id': match['id'],
                'human_a': human_a,
                'human_b': human_b,
                'overlap_score': match['overlap_score'],
                'overlap_reasons': match['overlap_reasons'],
            }

            # try to send intro to whichever person has an email address
            for recipient, other in [(human_a, human_b), (human_b, human_a)]:
                contact = recipient.get('contact', {})
                if isinstance(contact, str):
                    contact = json.loads(contact)

                email = contact.get('email')
                if not email:
                    continue

                # draft intro
                intro = draft_intro(match_data, recipient='a' if recipient == human_a else 'b')

                # parse overlap reasons for display
                reasons = match['overlap_reasons']
                if isinstance(reasons, str):
                    reasons = json.loads(reasons)
                reason_summary = ', '.join(reasons[:3]) if reasons else 'aligned values'

                if self.dry_run:
                    # print preview
                    print("\n" + "=" * 60)
                    print(f"TO: {recipient['username']} ({recipient['platform']})")
                    print(f"EMAIL: {email}")
                    print(f"SUBJECT: you might want to meet {other['username']}")
                    print(f"SCORE: {match['overlap_score']:.0f} ({reason_summary})")
                    print("-" * 60)
                    print("MESSAGE:")
                    print(intro['draft'])
                    print("-" * 60)
                    print("[DRY RUN - NOT SENT]")
                    print("=" * 60)
                    break
                else:
                    # actually send
                    success, error = send_email(
                        email,
                        f"connectd: you might want to meet {other['username']}",
                        intro['draft']
                    )

                    if success:
                        self.log(f"sent intro to {recipient['username']} ({email})")
                        self.intros_today += 1

                        # mark match as intro_sent (parameterized; double-quoted
                        # "intro_sent" would be an identifier in standard SQL)
                        c.execute('UPDATE matches SET status = ? WHERE id = ?',
                                  ('intro_sent', match['id']))
                        self.db.conn.commit()
                        break
                    else:
                        self.log(f"failed to send to {email}: {error}")

        self.last_intro = datetime.now()

    def send_lost_builder_intros(self):
        """
        reach out to lost builders - different tone, lower volume.
        these people need encouragement, not networking.
        """
        self.reset_daily_limits()

        lost_config = get_lost_config()

        if not lost_config.get('enabled', True):
            return

        max_per_day = lost_config.get('max_per_day', 5)
        if not self.dry_run and self.lost_intros_today >= max_per_day:
            self.log("daily lost builder intro limit reached")
            return

        # find lost builders with matching active builders
        matches, error = find_matches_for_lost_builders(
            self.db,
            min_lost_score=lost_config.get('min_lost_score', 40),
            min_values_score=lost_config.get('min_values_score', 20),
            limit=max_per_day - self.lost_intros_today
        )

        if error:
            self.log(f"lost builder matching error: {error}")
            return

        if not matches:
            self.log("no lost builders ready for outreach")
            return

        if self.dry_run:
            self.log(f"DRY RUN: previewing {len(matches)} lost builder intros")

        for match in matches:
            if not self.dry_run and self.lost_intros_today >= max_per_day:
                break

            lost = match['lost_user']
            builder = match['inspiring_builder']

            lost_name = lost.get('name') or lost.get('username')
            builder_name = builder.get('name') or builder.get('username')

            # draft intro
            draft, draft_error = draft_lost_intro(lost, builder, lost_config)

            if draft_error:
                self.log(f"error drafting lost intro for {lost_name}: {draft_error}")
                continue

            # determine best contact method (activity-based)
            method, contact_info = determine_best_contact(lost)

            if self.dry_run:
                print("\n" + "=" * 60)
                print("LOST BUILDER OUTREACH")
                print("=" * 60)
                print(f"TO: {lost_name} ({lost.get('platform')})")
                print(f"DELIVERY: {method} → {contact_info}")
                print(f"LOST SCORE: {lost.get('lost_potential_score', 0)}")
                print(f"VALUES SCORE: {lost.get('score', 0)}")
                print(f"INSPIRING BUILDER: {builder_name}")
                print(f"SHARED INTERESTS: {', '.join(match.get('shared_interests', []))}")
                print("-" * 60)
                print("MESSAGE:")
                print(draft)
                print("-" * 60)
                print("[DRY RUN - NOT SENT]")
                print("=" * 60)
            else:
                # build match data for unified delivery
                match_data = {
                    'human_a': builder,  # inspiring builder
                    'human_b': lost,     # lost builder (recipient)
                    'overlap_score': match.get('match_score', 0),
                    'overlap_reasons': match.get('shared_interests', []),
                }

                success, error, delivery_method = deliver_intro(match_data, draft)

                if success:
                    self.log(f"sent lost builder intro to {lost_name} via {delivery_method}")
                    self.lost_intros_today += 1
                    self.db.mark_lost_outreach(lost['id'])
                else:
                    self.log(f"failed to reach {lost_name} via {delivery_method}: {error}")

        self.last_lost = datetime.now()
        self.log(f"lost builder cycle complete: {self.lost_intros_today} sent today")

    def run(self):
        """main daemon loop"""
        self.log("connectd daemon starting...")

        # start API server
        start_api_thread()
        self.log("api server started on port 8099")

        if self.dry_run:
            self.log("*** DRY RUN MODE - no intros will be sent ***")
        self.log(f"scout interval: {SCOUT_INTERVAL}s")
        self.log(f"match interval: {MATCH_INTERVAL}s")
        self.log(f"intro interval: {INTRO_INTERVAL}s")
        self.log(f"lost interval: {LOST_INTERVAL}s")
        self.log(f"max intros/day: {MAX_INTROS_PER_DAY}")

        # initial scout
        self.scout_cycle()
        self._update_api_state()

        while self.running:
            now = datetime.now()

            # scout cycle (total_seconds, not .seconds, so intervals don't wrap at 24h)
            if not self.last_scout or (now - self.last_scout).total_seconds() >= SCOUT_INTERVAL:
                self.scout_cycle()
                self._update_api_state()

            # match cycle
            if not self.last_match or (now - self.last_match).total_seconds() >= MATCH_INTERVAL:
                self.match_priority_users()
                self.match_strangers()
                self._update_api_state()

            # intro cycle
            if not self.last_intro or (now - self.last_intro).total_seconds() >= INTRO_INTERVAL:
                self.send_stranger_intros()
                self._update_api_state()

            # lost builder cycle
            if not self.last_lost or (now - self.last_lost).total_seconds() >= LOST_INTERVAL:
                self.send_lost_builder_intros()
                self._update_api_state()

            # sleep between checks
            time.sleep(60)

        self.log("connectd daemon stopped")
        self.db.close()


def run_daemon(dry_run=False):
    """entry point"""
    daemon = ConnectDaemon(dry_run=dry_run)
    daemon.run()


if __name__ == '__main__':
    import sys
    dry_run = '--dry-run' in sys.argv
    run_daemon(dry_run=dry_run)
@ -1,510 +0,0 @@
"""
priority users - people who host connectd get direct matching
"""

import sqlite3
import json
from datetime import datetime
from pathlib import Path

DB_PATH = Path(__file__).parent / 'connectd.db'

# map user-friendly interests to signal terms
INTEREST_TO_SIGNALS = {
    'self-hosting': ['selfhosted', 'home_automation'],
    'home-assistant': ['home_automation'],
    'intentional-community': ['community', 'cooperative'],
    'cooperatives': ['cooperative', 'community'],
    'solarpunk': ['solarpunk'],
    'privacy': ['privacy', 'local_first'],
    'local-first': ['local_first', 'privacy'],
    'queer-friendly': ['queer'],
    'anti-capitalism': ['cooperative', 'decentralized', 'community'],
    'esports-venue': [],
    'foss': ['foss'],
    'decentralized': ['decentralized'],
    'federated': ['federated_chat'],
    'mesh': ['mesh'],
}


def init_users_table(conn):
    """create priority users table"""
    c = conn.cursor()

    c.execute('''CREATE TABLE IF NOT EXISTS priority_users (
                    id INTEGER PRIMARY KEY,
                    name TEXT,
                    email TEXT UNIQUE,
                    github TEXT,
                    reddit TEXT,
                    mastodon TEXT,
                    lobsters TEXT,
                    matrix TEXT,
                    lemmy TEXT,
                    discord TEXT,
                    bluesky TEXT,
                    location TEXT,
                    bio TEXT,
                    interests TEXT,
                    looking_for TEXT,
                    created_at TEXT,
                    active INTEGER DEFAULT 1,
                    score REAL DEFAULT 0,
                    signals TEXT,
                    scraped_profile TEXT,
                    last_scored_at TEXT
                )''')

    # add missing columns to existing table
    for col in ['lemmy', 'discord', 'bluesky']:
        try:
            c.execute(f'ALTER TABLE priority_users ADD COLUMN {col} TEXT')
        except sqlite3.OperationalError:
            pass  # column already exists

    # matches specifically for priority users
    c.execute('''CREATE TABLE IF NOT EXISTS priority_matches (
                    id INTEGER PRIMARY KEY,
                    priority_user_id INTEGER,
                    matched_human_id INTEGER,
                    overlap_score REAL,
                    overlap_reasons TEXT,
                    status TEXT DEFAULT 'new',
                    notified_at TEXT,
                    viewed_at TEXT,
                    FOREIGN KEY(priority_user_id) REFERENCES priority_users(id),
                    FOREIGN KEY(matched_human_id) REFERENCES humans(id)
                )''')

    conn.commit()


def add_priority_user(conn, user_data):
    """add a priority user (someone hosting connectd)"""
    c = conn.cursor()

    c.execute('''INSERT OR REPLACE INTO priority_users
                 (name, email, github, reddit, mastodon, lobsters, matrix, lemmy, discord, bluesky,
                  location, bio, interests, looking_for, created_at)
                 VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)''',
              (user_data.get('name'),
               user_data.get('email'),
               user_data.get('github'),
               user_data.get('reddit'),
               user_data.get('mastodon'),
               user_data.get('lobsters'),
               user_data.get('matrix'),
               user_data.get('lemmy'),
               user_data.get('discord'),
               user_data.get('bluesky'),
               user_data.get('location'),
               user_data.get('bio'),
               json.dumps(user_data.get('interests', [])),
               user_data.get('looking_for'),
               datetime.now().isoformat()))

    conn.commit()
    return c.lastrowid


def get_priority_users(conn):
    """get all active priority users"""
    c = conn.cursor()
    c.execute('SELECT * FROM priority_users WHERE active = 1')
    return [dict(row) for row in c.fetchall()]


def get_priority_user(conn, user_id):
    """get a specific priority user"""
    c = conn.cursor()
    c.execute('SELECT * FROM priority_users WHERE id = ?', (user_id,))
    row = c.fetchone()
    return dict(row) if row else None


def save_priority_match(conn, priority_user_id, human_id, overlap_data):
    """save a match for a priority user"""
    c = conn.cursor()

    c.execute('''INSERT OR IGNORE INTO priority_matches
                 (priority_user_id, matched_human_id, overlap_score, overlap_reasons, status)
                 VALUES (?, ?, ?, ?, 'new')''',
              (priority_user_id, human_id,
               overlap_data.get('overlap_score', 0),
               json.dumps(overlap_data.get('overlap_reasons', []))))

    conn.commit()
    return c.lastrowid


def get_priority_user_matches(conn, priority_user_id, status=None, limit=50):
    """get matches for a priority user"""
    c = conn.cursor()

    if status:
        c.execute('''SELECT pm.*, h.* FROM priority_matches pm
                     JOIN humans h ON pm.matched_human_id = h.id
                     WHERE pm.priority_user_id = ? AND pm.status = ?
                     ORDER BY pm.overlap_score DESC
                     LIMIT ?''', (priority_user_id, status, limit))
    else:
        c.execute('''SELECT pm.*, h.* FROM priority_matches pm
                     JOIN humans h ON pm.matched_human_id = h.id
                     WHERE pm.priority_user_id = ?
                     ORDER BY pm.overlap_score DESC
                     LIMIT ?''', (priority_user_id, limit))

    return [dict(row) for row in c.fetchall()]


def mark_match_viewed(conn, match_id):
    """mark a priority match as viewed"""
    c = conn.cursor()
    c.execute('''UPDATE priority_matches SET status = 'viewed', viewed_at = ?
                 WHERE id = ?''', (datetime.now().isoformat(), match_id))
    conn.commit()


def expand_interests_to_signals(interests):
    """expand user-friendly interests to signal terms"""
    signals = set()
    for interest in interests:
        interest_lower = interest.lower().strip()
        if interest_lower in INTEREST_TO_SIGNALS:
            signals.update(INTEREST_TO_SIGNALS[interest_lower])
        else:
            signals.add(interest_lower)

    # always add these aligned signals for priority users
    signals.update(['foss', 'decentralized', 'federated_chat', 'containers', 'unix', 'selfhosted'])
    return list(signals)
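The expansion is a table lookup with a pass-through for unknown terms. A self-contained sketch using a trimmed copy of the mapping above (the always-added baseline signals are omitted here so the output stays readable):

```python
# trimmed copy of INTEREST_TO_SIGNALS, for illustration only
INTEREST_TO_SIGNALS = {
    'self-hosting': ['selfhosted', 'home_automation'],
    'privacy': ['privacy', 'local_first'],
    'solarpunk': ['solarpunk'],
}

def expand(interests):
    """map friendly interest names to signal terms; unknown terms pass through as-is"""
    signals = set()
    for interest in interests:
        key = interest.lower().strip()
        signals.update(INTEREST_TO_SIGNALS.get(key, [key]))
    return sorted(signals)

print(expand(['Self-Hosting', 'privacy', 'mesh networking']))
# ['home_automation', 'local_first', 'mesh networking', 'privacy', 'selfhosted']
```

Passing unknown terms through unchanged means a host can list any niche interest and still match people who used the exact same phrase, even before the mapping table learns about it.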


def score_priority_user(conn, user_id, scraped_profile=None):
    """
    calculate a score for a priority user based on:
    - their stated interests
    - their scraped github profile (if available)
    - their repos and activity
    """
    c = conn.cursor()
    c.execute('SELECT * FROM priority_users WHERE id = ?', (user_id,))
    row = c.fetchone()
    if not row:
        return None

    user = dict(row)
    score = 0
    signals = set()

    # 1. score from stated interests
    interests = user.get('interests')
    if isinstance(interests, str):
        interests = json.loads(interests) if interests else []

    for interest in interests:
        interest_lower = interest.lower()
        # high-value interests
        if 'solarpunk' in interest_lower:
            score += 30
            signals.add('solarpunk')
        if 'queer' in interest_lower:
            score += 30
            signals.add('queer')
        if 'cooperative' in interest_lower or 'intentional' in interest_lower:
            score += 20
            signals.add('cooperative')
        if 'privacy' in interest_lower:
            score += 10
            signals.add('privacy')
        if 'self-host' in interest_lower or 'selfhost' in interest_lower:
            score += 15
            signals.add('selfhosted')
        if 'home-assistant' in interest_lower:
            score += 15
            signals.add('home_automation')
        if 'foss' in interest_lower or 'open source' in interest_lower:
            score += 10
            signals.add('foss')

    # 2. score from scraped profile
    if scraped_profile:
        # repos
        repos = scraped_profile.get('top_repos', [])
        if len(repos) >= 20:
            score += 20
        elif len(repos) >= 10:
            score += 10
        elif len(repos) >= 5:
            score += 5

        # languages
        languages = scraped_profile.get('languages', {})
        if 'Python' in languages or 'Rust' in languages:
            score += 5
            signals.add('modern_lang')

        # topics from repos
        topics = scraped_profile.get('topics', [])
        for topic in topics:
            if topic in ['self-hosted', 'home-assistant', 'privacy', 'foss']:
                score += 10
                signals.add(topic.replace('-', '_'))

        # followers
        followers = scraped_profile.get('followers', 0)
        if followers >= 100:
            score += 15
        elif followers >= 50:
            score += 10
        elif followers >= 10:
            score += 5

    # 3. add expanded signals
    expanded = expand_interests_to_signals(interests)
    signals.update(expanded)

    # update user
    c.execute('''UPDATE priority_users
                 SET score = ?, signals = ?, scraped_profile = ?, last_scored_at = ?
                 WHERE id = ?''',
              (score, json.dumps(list(signals)), json.dumps(scraped_profile) if scraped_profile else None,
               datetime.now().isoformat(), user_id))
    conn.commit()

    return {'score': score, 'signals': list(signals)}
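The repo and follower bonuses above are simple tier tables: a value earns the bonus of the highest tier it reaches. A minimal sketch of that tiering logic (the tier values are copied from the function above; the `tier` helper itself is an illustration, not part of the codebase):

```python
def tier(value, tiers):
    """return the bonus for the highest tier value reaches; tiers sorted high to low"""
    for threshold, bonus in tiers:
        if value >= threshold:
            return bonus
    return 0

REPO_TIERS = [(20, 20), (10, 10), (5, 5)]
FOLLOWER_TIERS = [(100, 15), (50, 10), (10, 5)]

print(tier(12, REPO_TIERS), tier(73, FOLLOWER_TIERS))  # 10 10
```

Expressing the cascading `if/elif` chains as data like this makes the tiers easy to tune without touching control flow.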


def auto_match_priority_user(conn, user_id, min_overlap=40):
    """
    automatically find and save matches for a priority user
    uses relationship filtering to skip already-connected people
    """
    from scoutd.deep import check_already_connected

    c = conn.cursor()

    # get user
    c.execute('SELECT * FROM priority_users WHERE id = ?', (user_id,))
    row = c.fetchone()
    if not row:
        return []

    user = dict(row)

    # get user signals
    user_signals = set()
    if user.get('signals'):
        signals = json.loads(user['signals']) if isinstance(user['signals'], str) else user['signals']
        user_signals.update(signals)

    # also expand interests
    if user.get('interests'):
        interests = json.loads(user['interests']) if isinstance(user['interests'], str) else user['interests']
        user_signals.update(expand_interests_to_signals(interests))

    # clear old matches
    c.execute('DELETE FROM priority_matches WHERE priority_user_id = ?', (user_id,))
    conn.commit()

    # get all humans
    c.execute('SELECT * FROM humans WHERE score >= 25')
    columns = [d[0] for d in c.description]

    matches = []
    for row in c.fetchall():
        human = dict(zip(columns, row))

        # skip own profiles
        username = (human.get('username') or '').lower()
        if user.get('github') and username == user['github'].lower():
            continue
        if user.get('reddit') and username == user.get('reddit', '').lower():
            continue

        # check if already connected
        user_human = {'username': user.get('github'), 'platform': 'github', 'extra': {}}
        connected, reason = check_already_connected(user_human, human)
        if connected:
            continue

        # get human signals
        human_signals = human.get('signals', [])
        if isinstance(human_signals, str):
            human_signals = json.loads(human_signals) if human_signals else []

        # calculate overlap
        shared = user_signals & set(human_signals)
        overlap_score = len(shared) * 10

        # high-value bonuses
        if 'queer' in human_signals:
            overlap_score += 40
            shared.add('queer (rare!)')
        if 'solarpunk' in human_signals:
            overlap_score += 30
            shared.add('solarpunk (rare!)')
        if 'cooperative' in human_signals:
            overlap_score += 20
            shared.add('cooperative (values)')

        # location bonus
        location = (human.get('location') or '').lower()
        user_location = (user.get('location') or '').lower()
        if user_location and location:
            if any(x in location for x in ['seattle', 'portland', 'pnw', 'washington', 'oregon']):
                if 'seattle' in user_location or 'pnw' in user_location:
                    overlap_score += 25
                    shared.add('PNW location!')

        if overlap_score >= min_overlap:
            matches.append({
                'human': human,
                'overlap_score': overlap_score,
                'shared': list(shared),
            })

    # sort and save top matches
    matches.sort(key=lambda x: x['overlap_score'], reverse=True)

    for m in matches[:50]:  # save top 50
        save_priority_match(conn, user_id, m['human']['id'], {
            'overlap_score': m['overlap_score'],
            'overlap_reasons': m['shared'],
        })

    return matches


def update_priority_user_profile(conn, user_id, profile_data):
    """update a priority user's profile with new data"""
    c = conn.cursor()

    updates = []
    values = []

    for field in ['name', 'email', 'github', 'reddit', 'mastodon', 'lobsters',
                  'matrix', 'lemmy', 'discord', 'bluesky', 'location', 'bio', 'looking_for']:
        if field in profile_data and profile_data[field]:
            updates.append(f'{field} = ?')
            values.append(profile_data[field])

    if 'interests' in profile_data:
        updates.append('interests = ?')
        values.append(json.dumps(profile_data['interests']))

    if updates:
        values.append(user_id)
        c.execute(f'''UPDATE priority_users SET {', '.join(updates)} WHERE id = ?''', values)
        conn.commit()

    return True
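Building the SET clause dynamically keeps the update sparse: only provided, non-empty fields are written, and the hard-coded field list acts as a whitelist so the f-string never interpolates attacker-controlled column names. A runnable sketch of the same pattern against a throwaway in-memory table (the table and columns here are illustrative):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, bio TEXT, location TEXT)')
conn.execute("INSERT INTO users VALUES (1, 'old name', 'old bio', 'old town')")

def update_user(conn, user_id, data, allowed=('name', 'bio', 'location')):
    """update only the whitelisted fields that are present (and truthy) in data"""
    updates, values = [], []
    for field in allowed:  # whitelist guards the interpolated column names
        if data.get(field):
            updates.append(f'{field} = ?')
            values.append(data[field])
    if updates:
        values.append(user_id)
        conn.execute(f"UPDATE users SET {', '.join(updates)} WHERE id = ?", values)
        conn.commit()

update_user(conn, 1, {'name': 'new name', 'location': ''})  # empty location is skipped
print(conn.execute('SELECT name, bio, location FROM users WHERE id = 1').fetchone())
# ('new name', 'old bio', 'old town')
```

The truthiness check is why discovery runs never blank out fields a host filled in manually: absent and empty values are simply left alone.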


def discover_host_user(conn, alias):
    """
    auto-discover a host user by their alias (username).
    scrapes github and discovers all connected social handles.
    also merges in HOST_ env vars from config for manual overrides.

    returns the priority user id
    """
    from scoutd.github import analyze_github_user
    from config import (HOST_NAME, HOST_EMAIL, HOST_GITHUB, HOST_MASTODON,
                        HOST_REDDIT, HOST_LEMMY, HOST_LOBSTERS, HOST_MATRIX,
                        HOST_DISCORD, HOST_BLUESKY, HOST_LOCATION, HOST_INTERESTS, HOST_LOOKING_FOR)

    print(f"connectd: discovering host user '{alias}'...")

    # scrape github for full profile
    profile = analyze_github_user(alias)

    if not profile:
        print(f"  could not find github user '{alias}'")
        # still create from env vars if no github found
        profile = {'name': HOST_NAME or alias, 'bio': '', 'location': HOST_LOCATION,
                   'contact': {}, 'extra': {'handles': {}}, 'topics': [], 'signals': []}

    print(f"  found: {profile.get('name')} ({alias})")
    print(f"  score: {profile.get('score', 0)}, signals: {len(profile.get('signals', []))}")

    # extract contact info
    contact = profile.get('contact', {})
    handles = profile.get('extra', {}).get('handles', {})

    # merge in HOST_ env vars (override discovered values)
    if HOST_MASTODON:
        handles['mastodon'] = HOST_MASTODON
    if HOST_REDDIT:
        handles['reddit'] = HOST_REDDIT
    if HOST_LEMMY:
        handles['lemmy'] = HOST_LEMMY
    if HOST_LOBSTERS:
        handles['lobsters'] = HOST_LOBSTERS
    if HOST_MATRIX:
        handles['matrix'] = HOST_MATRIX
    if HOST_DISCORD:
        handles['discord'] = HOST_DISCORD
    if HOST_BLUESKY:
        handles['bluesky'] = HOST_BLUESKY

    # check if user already exists
    c = conn.cursor()
    c.execute('SELECT id FROM priority_users WHERE github = ?', (alias,))
    existing = c.fetchone()

    # parse HOST_INTERESTS if provided
    interests = profile.get('topics', [])
    if HOST_INTERESTS:
        interests = [i.strip() for i in HOST_INTERESTS.split(',') if i.strip()]

    user_data = {
        'name': HOST_NAME or profile.get('name') or alias,
        'email': HOST_EMAIL or contact.get('email'),
        'github': HOST_GITHUB or alias,
        'reddit': handles.get('reddit'),
        'mastodon': handles.get('mastodon') or contact.get('mastodon'),
        'lobsters': handles.get('lobsters'),
        'matrix': handles.get('matrix') or contact.get('matrix'),
        'lemmy': handles.get('lemmy') or contact.get('lemmy'),
        'discord': handles.get('discord'),
        'bluesky': handles.get('bluesky') or contact.get('bluesky'),
        'location': HOST_LOCATION or profile.get('location'),
        'bio': profile.get('bio'),
        'interests': interests,
        'looking_for': HOST_LOOKING_FOR,
    }

    if existing:
        # update existing user
        user_id = existing['id']
        update_priority_user_profile(conn, user_id, user_data)
        print(f"  updated existing priority user (id={user_id})")
    else:
        # create new user
        user_id = add_priority_user(conn, user_data)
        print(f"  created new priority user (id={user_id})")

    # score the user
    scraped_profile = {
        'top_repos': profile.get('extra', {}).get('top_repos', []),
        'languages': profile.get('languages', {}),
        'topics': profile.get('topics', []),
        'followers': profile.get('extra', {}).get('followers', 0),
    }
    score_result = score_priority_user(conn, user_id, scraped_profile)
    print(f"  scored: {score_result.get('score')}, {len(score_result.get('signals', []))} signals")

    # print discovered handles
    print("  discovered handles:")
    for platform, handle in handles.items():
        print(f"    {platform}: {handle}")

    return user_id
|
||||
|
||||
def get_host_user(conn):
|
||||
"""get the host user (first priority user)"""
|
||||
users = get_priority_users(conn)
|
||||
return users[0] if users else None
|
||||
Binary file not shown (image, 1.4 MiB).
@@ -1,10 +0,0 @@
"""
introd - outreach module
drafts intros, queues for human review, sends via appropriate channel
"""

from .draft import draft_intro
from .review import get_pending_intros, approve_intro, reject_intro
from .send import send_intro

__all__ = ['draft_intro', 'get_pending_intros', 'approve_intro', 'reject_intro', 'send_intro']
@@ -1,210 +0,0 @@
"""
introd/draft.py - AI writes intro messages referencing both parties' work
"""

import json

# intro template - transparent about being AI, neutral third party
INTRO_TEMPLATE = """hi {recipient_name},

i'm an AI that connects isolated builders working on similar things.

you're building: {recipient_summary}

{other_name} is building: {other_summary}

overlap: {overlap_summary}

thought you might benefit from knowing each other.

their work: {other_url}

no pitch. just connection. ignore if not useful.

- connectd
"""

# shorter version for platforms with character limits
SHORT_TEMPLATE = """hi {recipient_name} - i'm an AI connecting aligned builders.

you: {recipient_summary}
{other_name}: {other_summary}

overlap: {overlap_summary}

their work: {other_url}

no pitch, just connection.
"""


def summarize_human(human_data):
    """generate a brief summary of what someone is building/interested in"""
    parts = []

    # name or username
    name = human_data.get('name') or human_data.get('username', 'unknown')

    # platform context
    platform = human_data.get('platform', '')

    # signals/interests
    signals = human_data.get('signals', [])
    if isinstance(signals, str):
        signals = json.loads(signals)

    # extra data
    extra = human_data.get('extra', {})
    if isinstance(extra, str):
        extra = json.loads(extra)

    # build summary based on available data
    topics = extra.get('topics', [])
    languages = list(extra.get('languages', {}).keys())[:3]
    repo_count = extra.get('repo_count', 0)
    subreddits = extra.get('subreddits', [])

    if platform == 'github':
        if topics:
            parts.append(f"working on {', '.join(topics[:3])}")
        if languages:
            parts.append(f"using {', '.join(languages)}")
        if repo_count > 10:
            parts.append(f"({repo_count} repos)")

    elif platform == 'reddit':
        if subreddits:
            parts.append(f"active in r/{', r/'.join(subreddits[:3])}")

    elif platform == 'mastodon':
        instance = extra.get('instance', '')
        if instance:
            parts.append(f"on {instance}")

    elif platform == 'lobsters':
        karma = extra.get('karma', 0)
        if karma > 50:
            parts.append(f"active on lobste.rs ({karma} karma)")

    # add key signals
    key_signals = [s for s in signals if s in ['selfhosted', 'privacy', 'cooperative',
                                               'solarpunk', 'intentional_community',
                                               'home_automation', 'foss']]
    if key_signals:
        parts.append(f"interested in {', '.join(key_signals[:3])}")

    if not parts:
        parts.append(f"builder on {platform}")

    return ' | '.join(parts)


def summarize_overlap(overlap_data):
    """generate overlap summary"""
    reasons = overlap_data.get('overlap_reasons', [])
    if isinstance(reasons, str):
        reasons = json.loads(reasons)

    if reasons:
        return ' | '.join(reasons[:3])

    # fallback
    shared = overlap_data.get('shared_signals', [])
    if shared:
        return f"shared interests: {', '.join(shared[:3])}"

    return "aligned values and interests"


def draft_intro(match_data, recipient='a'):
    """
    draft an intro message for a match

    match_data: dict with human_a, human_b, overlap info
    recipient: 'a' or 'b' - who receives this intro

    returns: dict with draft text, channel, metadata
    """
    if recipient == 'a':
        recipient_human = match_data['human_a']
        other_human = match_data['human_b']
    else:
        recipient_human = match_data['human_b']
        other_human = match_data['human_a']

    # get names
    recipient_name = recipient_human.get('name') or recipient_human.get('username', 'friend')
    other_name = other_human.get('name') or other_human.get('username', 'someone')

    # generate summaries
    recipient_summary = summarize_human(recipient_human)
    other_summary = summarize_human(other_human)
    overlap_summary = summarize_overlap(match_data)

    # other's url
    other_url = other_human.get('url', '')

    # determine best channel
    contact = recipient_human.get('contact', {})
    if isinstance(contact, str):
        contact = json.loads(contact)

    channel = None
    channel_address = None

    # prefer email if available
    if contact.get('email'):
        channel = 'email'
        channel_address = contact['email']
    # github issue/discussion
    elif recipient_human.get('platform') == 'github':
        channel = 'github'
        channel_address = recipient_human.get('url')
    # mastodon DM
    elif recipient_human.get('platform') == 'mastodon':
        channel = 'mastodon'
        channel_address = recipient_human.get('username')
    # reddit message
    elif recipient_human.get('platform') == 'reddit':
        channel = 'reddit'
        channel_address = recipient_human.get('username')
    else:
        channel = 'manual'
        channel_address = recipient_human.get('url')

    # choose template based on channel
    if channel in ['mastodon', 'reddit']:
        template = SHORT_TEMPLATE
    else:
        template = INTRO_TEMPLATE

    # render draft
    draft = template.format(
        recipient_name=recipient_name.split()[0] if recipient_name else 'friend',  # first name only
        recipient_summary=recipient_summary,
        other_name=other_name.split()[0] if other_name else 'someone',
        other_summary=other_summary,
        overlap_summary=overlap_summary,
        other_url=other_url,
    )

    return {
        'recipient_human': recipient_human,
        'other_human': other_human,
        'channel': channel,
        'channel_address': channel_address,
        'draft': draft,
        'overlap_score': match_data.get('overlap_score', 0),
        'match_id': match_data.get('id'),
    }


def draft_intros_for_match(match_data):
    """
    draft intros for both parties in a match
    returns list of two intro dicts
    """
    intro_a = draft_intro(match_data, recipient='a')
    intro_b = draft_intro(match_data, recipient='b')

    return [intro_a, intro_b]
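The drafting step in draft.py is plain `str.format` over one of the two templates. A minimal sketch of how `SHORT_TEMPLATE` renders, with the template inlined so it runs standalone; the names (`ada`, `lin`) and summary strings are made-up sample data, not output of the real summarizers.

```python
# inlined copy of SHORT_TEMPLATE so this sketch runs standalone;
# 'ada', 'lin', and the summaries are invented sample data
SHORT_TEMPLATE = """hi {recipient_name} - i'm an AI connecting aligned builders.

you: {recipient_summary}
{other_name}: {other_summary}

overlap: {overlap_summary}

their work: {other_url}

no pitch, just connection.
"""

draft = SHORT_TEMPLATE.format(
    recipient_name='ada',
    recipient_summary='working on selfhosted, privacy | using python',
    other_name='lin',
    other_summary='working on p2p, local_first | using rust',
    overlap_summary='shared interests: privacy, decentralization',
    other_url='https://github.com/lin',
)
print(draft)
```

In the module, `draft_intro` picks this template for the character-limited channels (mastodon, reddit) and the longer `INTRO_TEMPLATE` otherwise.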
@@ -1,250 +0,0 @@
"""
introd/lost_intro.py - intro drafting for lost builders

different tone than builder-to-builder intros.
these people need encouragement, not networking.

the goal isn't to recruit them. it's to show them the door exists.
they take it or they don't. but they'll know someone saw them.
"""

import os
import json
import requests
from datetime import datetime

GROQ_API_KEY = os.environ.get('GROQ_API_KEY', '')
GROQ_API_URL = 'https://api.groq.com/openai/v1/chat/completions'
MODEL = os.environ.get('GROQ_MODEL', 'llama-3.1-70b-versatile')


LOST_INTRO_TEMPLATE = """hey {name},

i'm connectd. i'm a daemon that finds people who might need a nudge.

i noticed you're interested in {interests}. you ask good questions. you clearly get it.

but maybe you haven't built anything yet. or you started and stopped. or you don't think you can.

that's okay. most people don't.

but some people do. here's one: {builder_name} ({builder_url})

{builder_description}

they started where you are. look at what they built.

you're not behind. you're just not started yet.

no pressure. just wanted you to know someone noticed.

- connectd"""


SYSTEM_PROMPT = """you are connectd, a daemon that finds isolated builders with aligned values and connects them.

right now you're reaching out to someone who has POTENTIAL but hasn't found it yet. maybe they gave up, maybe they're stuck, maybe they don't believe they can do it.

your job is to:
1. acknowledge where they are without being condescending
2. point them to an active builder who could inspire them
3. be genuine, not salesy or motivational-speaker-y
4. keep it short - these people are tired, don't overwhelm them
5. use lowercase, be human, no corporate bullshit
6. make it clear there's no pressure, no follow-up spam

you're not recruiting. you're not selling. you're just showing them a door.

the template structure:
- acknowledge them (you noticed something about them)
- normalize where they are (most people don't build things)
- show them someone who did (the builder)
- brief encouragement (you're not behind, just not started)
- sign off with no pressure

do NOT:
- be preachy or lecture them
- use motivational cliches ("you got this!", "believe in yourself!")
- make promises about outcomes
- be too long - they don't have energy for long messages
- make them feel bad about where they are"""


def draft_lost_intro(lost_user, inspiring_builder, config=None):
    """
    draft an intro for a lost builder, pairing them with an inspiring active builder.

    lost_user: the person who needs a nudge
    inspiring_builder: an active builder with similar interests who could inspire them
    """
    config = config or {}

    # gather info about lost user
    lost_name = lost_user.get('name') or lost_user.get('username', 'there')
    lost_signals = lost_user.get('lost_signals', [])
    lost_interests = extract_interests(lost_user)

    # gather info about inspiring builder
    builder_name = inspiring_builder.get('name') or inspiring_builder.get('username')
    builder_url = inspiring_builder.get('url') or f"https://github.com/{inspiring_builder.get('username')}"
    builder_description = create_builder_description(inspiring_builder)

    # use LLM to personalize
    if GROQ_API_KEY and config.get('use_llm', True):
        return draft_with_llm(lost_user, inspiring_builder, lost_interests, builder_description)

    # fallback to template
    return LOST_INTRO_TEMPLATE.format(
        name=lost_name,
        interests=', '.join(lost_interests[:3]) if lost_interests else 'building things',
        builder_name=builder_name,
        builder_url=builder_url,
        builder_description=builder_description,
    ), None


def extract_interests(user):
    """extract interests from user profile"""
    interests = []

    # from topics/tags
    extra = user.get('extra', {})
    if isinstance(extra, str):
        try:
            extra = json.loads(extra)
        except Exception:
            extra = {}

    topics = extra.get('topics', []) or extra.get('aligned_topics', [])
    interests.extend(topics[:5])

    # from subreddits
    subreddits = user.get('subreddits', [])
    for sub in subreddits[:3]:
        if sub.lower() not in ['learnprogramming', 'findapath', 'getdisciplined']:
            interests.append(sub)

    # from bio keywords
    bio = user.get('bio') or ''
    bio_lower = bio.lower()

    interest_keywords = [
        'rust', 'python', 'javascript', 'go', 'linux', 'self-hosting', 'homelab',
        'privacy', 'security', 'open source', 'foss', 'decentralized', 'ai', 'ml',
        'web dev', 'backend', 'frontend', 'devops', 'data', 'automation',
    ]

    for kw in interest_keywords:
        if kw in bio_lower and kw not in interests:
            interests.append(kw)

    return interests[:5] if interests else ['technology', 'building things']


def create_builder_description(builder):
    """create a brief description of what the builder has done"""
    extra = builder.get('extra', {})
    if isinstance(extra, str):
        try:
            extra = json.loads(extra)
        except Exception:
            extra = {}

    parts = []

    # what they build
    repos = extra.get('top_repos', [])[:3]
    if repos:
        repo_names = [r.get('name') for r in repos if r.get('name')]
        if repo_names:
            parts.append(f"they've built things like {', '.join(repo_names[:2])}")

    # their focus
    topics = extra.get('aligned_topics', []) or extra.get('topics', [])
    if topics:
        parts.append(f"they work on {', '.join(topics[:3])}")

    # their vibe
    signals = builder.get('signals', [])
    if 'self-hosted' in str(signals).lower():
        parts.append("they're into self-hosting and owning their own infrastructure")
    if 'privacy' in str(signals).lower():
        parts.append("they care about privacy")
    if 'community' in str(signals).lower():
        parts.append("they're community-focused")

    if parts:
        return '. '.join(parts) + '.'
    else:
        return "they're building cool stuff in the open."


def draft_with_llm(lost_user, inspiring_builder, interests, builder_description):
    """use LLM to draft personalized intro"""

    lost_name = lost_user.get('name') or lost_user.get('username', 'there')
    lost_signals = lost_user.get('lost_signals', [])
    lost_bio = lost_user.get('bio', '')

    builder_name = inspiring_builder.get('name') or inspiring_builder.get('username')
    builder_url = inspiring_builder.get('url') or f"https://github.com/{inspiring_builder.get('username')}"

    user_prompt = f"""draft an intro for this lost builder:

LOST USER:
- name: {lost_name}
- interests: {', '.join(interests)}
- signals detected: {', '.join(lost_signals[:5]) if lost_signals else 'general stuck/aspiring patterns'}
- bio: {lost_bio[:200] if lost_bio else 'none'}

INSPIRING BUILDER TO SHOW THEM:
- name: {builder_name}
- url: {builder_url}
- what they do: {builder_description}

write a short, genuine message. no fluff. no motivational cliches. just human.
keep it under 150 words.
use lowercase.
end with "- connectd"
"""

    try:
        resp = requests.post(
            GROQ_API_URL,
            headers={
                'Authorization': f'Bearer {GROQ_API_KEY}',
                'Content-Type': 'application/json',
            },
            json={
                'model': MODEL,
                'messages': [
                    {'role': 'system', 'content': SYSTEM_PROMPT},
                    {'role': 'user', 'content': user_prompt},
                ],
                'temperature': 0.7,
                'max_tokens': 500,
            },
            timeout=30,
        )

        if resp.status_code == 200:
            content = resp.json()['choices'][0]['message']['content']
            return content.strip(), None
        else:
            return None, f"llm error: {resp.status_code}"

    except Exception as e:
        return None, str(e)


def get_lost_intro_config():
    """get configuration for lost builder outreach"""
    return {
        'enabled': True,
        'max_per_day': 5,        # lower volume, higher care
        'require_review': True,  # always manual approval
        'cooldown_days': 90,     # don't spam struggling people
        'min_lost_score': 40,
        'min_values_score': 20,
        'use_llm': True,
    }
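When `GROQ_API_KEY` is unset (or `use_llm` is disabled), `draft_lost_intro` skips the LLM and returns `(LOST_INTRO_TEMPLATE.format(...), None)`. A minimal sketch of that fallback path, with a trimmed copy of the template inlined for a standalone run; `sam` and `lin` are invented sample data.

```python
# trimmed copy of LOST_INTRO_TEMPLATE above, inlined so this runs standalone;
# 'sam' and 'lin' are made-up sample data
LOST_INTRO_TEMPLATE = """hey {name},

i noticed you're interested in {interests}.

but some people do. here's one: {builder_name} ({builder_url})

{builder_description}

- connectd"""

# mirror the module's (draft, error) return shape for the template path
draft, error = LOST_INTRO_TEMPLATE.format(
    name='sam',
    interests='self-hosting, privacy',
    builder_name='lin',
    builder_url='https://github.com/lin',
    builder_description="they're into self-hosting and owning their own infrastructure.",
), None
print(draft)
```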
@@ -1,126 +0,0 @@
"""
introd/review.py - human approval queue before sending
"""

import json
from datetime import datetime


def get_pending_intros(db, limit=50):
    """
    get all intros pending human review

    returns list of intro dicts with full context
    """
    rows = db.get_pending_intros(limit=limit)

    intros = []
    for row in rows:
        # get associated match and humans
        match_id = row.get('match_id')
        recipient_id = row.get('recipient_human_id')

        recipient = db.get_human_by_id(recipient_id) if recipient_id else None

        intros.append({
            'id': row['id'],
            'match_id': match_id,
            'recipient': recipient,
            'channel': row.get('channel'),
            'draft': row.get('draft'),
            'status': row.get('status'),
        })

    return intros


def approve_intro(db, intro_id, approved_by='human'):
    """
    approve an intro for sending

    intro_id: database id of the intro
    approved_by: who approved it (for audit trail)
    """
    db.approve_intro(intro_id, approved_by)
    print(f"introd: approved intro {intro_id} by {approved_by}")


def reject_intro(db, intro_id, reason=None):
    """
    reject an intro (won't be sent)
    """
    c = db.conn.cursor()
    c.execute('''UPDATE intros SET status = 'rejected',
                 approved_at = ?, approved_by = ? WHERE id = ?''',
              (datetime.now().isoformat(), f"rejected: {reason}" if reason else "rejected", intro_id))
    db.conn.commit()
    print(f"introd: rejected intro {intro_id}")


def review_intro_interactive(db, intro):
    """
    interactive review of a single intro

    returns: 'approve', 'reject', 'edit', or 'skip'
    """
    print("\n" + "=" * 60)
    print("INTRO FOR REVIEW")
    print("=" * 60)

    recipient = intro.get('recipient', {})
    print(f"\nRecipient: {recipient.get('name') or recipient.get('username')}")
    print(f"Platform: {recipient.get('platform')}")
    print(f"Channel: {intro.get('channel')}")
    print("\n--- DRAFT ---")
    print(intro.get('draft'))
    print("--- END ---\n")

    while True:
        choice = input("[a]pprove / [r]eject / [s]kip / [e]dit? ").strip().lower()

        if choice in ['a', 'approve']:
            approve_intro(db, intro['id'])
            return 'approve'
        elif choice in ['r', 'reject']:
            reason = input("reason (optional): ").strip()
            reject_intro(db, intro['id'], reason)
            return 'reject'
        elif choice in ['s', 'skip']:
            return 'skip'
        elif choice in ['e', 'edit']:
            print("editing not yet implemented - approve or reject")
        else:
            print("invalid choice")


def review_all_pending(db):
    """
    interactive review of all pending intros
    """
    intros = get_pending_intros(db)

    if not intros:
        print("no pending intros to review")
        return

    print(f"\n{len(intros)} intros pending review\n")

    approved = 0
    rejected = 0
    skipped = 0

    for intro in intros:
        result = review_intro_interactive(db, intro)

        if result == 'approve':
            approved += 1
        elif result == 'reject':
            rejected += 1
        else:
            skipped += 1

        cont = input("\ncontinue reviewing? [y/n] ").strip().lower()
        if cont != 'y':
            break

    print(f"\nreview complete: {approved} approved, {rejected} rejected, {skipped} skipped")
@@ -1,216 +0,0 @@
"""
introd/send.py - actually deliver intros via appropriate channel
"""

import smtplib
import requests
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
from datetime import datetime
import os

# email config (from env)
SMTP_HOST = os.environ.get('SMTP_HOST', '')
SMTP_PORT = int(os.environ.get('SMTP_PORT', '465'))
SMTP_USER = os.environ.get('SMTP_USER', '')
SMTP_PASS = os.environ.get('SMTP_PASS', '')
FROM_EMAIL = os.environ.get('FROM_EMAIL', '')


def send_email(to_email, subject, body):
    """send email via SMTP"""
    msg = MIMEMultipart()
    msg['From'] = FROM_EMAIL
    msg['To'] = to_email
    msg['Subject'] = subject

    msg.attach(MIMEText(body, 'plain'))

    try:
        with smtplib.SMTP_SSL(SMTP_HOST, SMTP_PORT) as server:
            server.login(SMTP_USER, SMTP_PASS)
            server.send_message(msg)
        return True, None
    except Exception as e:
        return False, str(e)


def send_github_issue(repo_url, title, body):
    """
    create a github issue (requires GITHUB_TOKEN)
    note: only works if you have write access to the repo
    typically won't work for random users - fallback to manual
    """
    # extract owner/repo from url
    # https://github.com/owner/repo -> owner/repo
    parts = repo_url.rstrip('/').split('/')
    if len(parts) < 2:
        return False, "invalid github url"

    owner = parts[-2]
    repo = parts[-1]

    token = os.environ.get('GITHUB_TOKEN')
    if not token:
        return False, "no github token"

    # would create issue via API - but this is invasive
    # better to just output the info for manual action
    return False, "github issues not automated - use manual outreach"


def send_mastodon_dm(instance, username, message):
    """
    send mastodon DM (requires account credentials)
    not implemented - requires oauth setup
    """
    return False, "mastodon DMs not automated - use manual outreach"


def send_reddit_message(username, subject, body):
    """
    send reddit message (requires account credentials)
    not implemented - requires oauth setup
    """
    return False, "reddit messages not automated - use manual outreach"


def send_intro(db, intro_id):
    """
    send an approved intro

    returns: (success, error_message)
    """
    # get intro from db
    c = db.conn.cursor()
    c.execute('SELECT * FROM intros WHERE id = ?', (intro_id,))
    row = c.fetchone()

    if not row:
        return False, "intro not found"

    intro = dict(row)

    if intro['status'] != 'approved':
        return False, f"intro not approved (status: {intro['status']})"

    channel = intro.get('channel')
    draft = intro.get('draft')

    # get recipient info
    recipient = db.get_human_by_id(intro['recipient_human_id'])
    if not recipient:
        return False, "recipient not found"

    success = False
    error = None

    if channel == 'email':
        # get email from contact
        import json
        contact = recipient.get('contact', {})
        if isinstance(contact, str):
            contact = json.loads(contact)

        email = contact.get('email')
        if email:
            success, error = send_email(
                email,
                "connection: aligned builder intro",
                draft
            )
        else:
            error = "no email address"

    elif channel == 'github':
        success, error = send_github_issue(
            recipient.get('url'),
            "connection: aligned builder intro",
            draft
        )

    elif channel == 'mastodon':
        success, error = send_mastodon_dm(
            recipient.get('instance'),
            recipient.get('username'),
            draft
        )

    elif channel == 'reddit':
        success, error = send_reddit_message(
            recipient.get('username'),
            "connection: aligned builder intro",
            draft
        )

    else:
        error = f"unknown channel: {channel}"

    # update status
    if success:
        db.mark_intro_sent(intro_id)
        print(f"introd: sent intro {intro_id} via {channel}")
    else:
        # mark as needs manual sending
        c.execute('''UPDATE intros SET status = 'manual_needed',
                     approved_at = ? WHERE id = ?''',
                  (datetime.now().isoformat(), intro_id))
        db.conn.commit()
        print(f"introd: intro {intro_id} needs manual send ({error})")

    return success, error


def send_all_approved(db):
    """
    send all approved intros
    """
    c = db.conn.cursor()
    c.execute("SELECT id FROM intros WHERE status = 'approved'")
    rows = c.fetchall()

    if not rows:
        print("no approved intros to send")
        return

    print(f"sending {len(rows)} approved intros...")

    sent = 0
    failed = 0

    for row in rows:
        success, error = send_intro(db, row['id'])
        if success:
            sent += 1
        else:
            failed += 1

    print(f"sent: {sent}, failed/manual: {failed}")


def export_manual_intros(db, output_file='manual_intros.txt'):
    """
    export intros that need manual sending to a text file
    """
    c = db.conn.cursor()
    c.execute('''SELECT i.*, h.username, h.platform, h.url
                 FROM intros i
                 JOIN humans h ON i.recipient_human_id = h.id
                 WHERE i.status IN ('approved', 'manual_needed')''')
    rows = c.fetchall()

    if not rows:
        print("no intros to export")
        return

    with open(output_file, 'w') as f:
        for row in rows:
            f.write("=" * 60 + "\n")
            f.write(f"TO: {row['username']} ({row['platform']})\n")
            f.write(f"URL: {row['url']}\n")
            f.write(f"CHANNEL: {row['channel']}\n")
            f.write("-" * 60 + "\n")
            f.write(row['draft'] + "\n")
            f.write("\n")

    print(f"exported {len(rows)} intros to {output_file}")
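`send_email` above assembles a multipart MIME message before handing it to `smtplib.SMTP_SSL`; that construction step can be exercised without a mail server. A minimal sketch: `build_intro_email` is a hypothetical helper (not part of the module), and the addresses are invented.

```python
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart

def build_intro_email(from_addr, to_addr, subject, body):
    # same MIME structure send_email() builds before SMTP delivery
    msg = MIMEMultipart()
    msg['From'] = from_addr
    msg['To'] = to_addr
    msg['Subject'] = subject
    msg.attach(MIMEText(body, 'plain'))
    return msg

msg = build_intro_email('connectd <bot@example.com>', 'dev@example.com',
                        'connection: aligned builder intro',
                        'hi - no pitch, just connection.')
print(msg['To'])
```

Separating construction from delivery like this also makes the SMTP path easy to dry-run in tests.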
Binary file not shown (image, 1.4 MiB).
@@ -1,10 +0,0 @@
"""
matchd - pairing module
generates fingerprints, finds overlaps, ranks matches
"""

from .fingerprint import generate_fingerprint
from .overlap import find_overlap
from .rank import rank_matches, find_all_matches

__all__ = ['generate_fingerprint', 'find_overlap', 'rank_matches', 'find_all_matches']
@@ -1,210 +0,0 @@
"""
matchd/fingerprint.py - generate values profiles for humans
"""

import json
from collections import defaultdict

# values dimensions we track
VALUES_DIMENSIONS = [
    'privacy',            # surveillance concern, degoogle, self-hosted
    'decentralization',   # p2p, fediverse, local-first
    'cooperation',        # coops, mutual aid, community
    'queer_friendly',     # lgbtq+, pronouns
    'environmental',      # solarpunk, degrowth, sustainability
    'anticapitalist',     # post-capitalism, worker ownership
    'builder',            # creates vs consumes
    'pnw_oriented',       # pacific northwest connection
]

# skill categories
SKILL_CATEGORIES = [
    'backend',    # python, go, rust, databases
    'frontend',   # js, react, css
    'devops',     # docker, k8s, linux admin
    'hardware',   # electronics, embedded, iot
    'design',     # ui/ux, graphics
    'community',  # organizing, facilitation
    'writing',    # documentation, content
]

# signal to dimension mapping
SIGNAL_TO_DIMENSION = {
    'privacy': 'privacy',
    'selfhosted': 'privacy',
    'degoogle': 'privacy',
    'decentralized': 'decentralization',
    'local_first': 'decentralization',
    'p2p': 'decentralization',
    'federated_chat': 'decentralization',
    'foss': 'decentralization',
    'cooperative': 'cooperation',
    'community': 'cooperation',
    'mutual_aid': 'cooperation',
    'intentional_community': 'cooperation',
    'queer': 'queer_friendly',
    'pronouns': 'queer_friendly',
    'blm': 'queer_friendly',
    'acab': 'queer_friendly',
    'solarpunk': 'environmental',
    'anticapitalist': 'anticapitalist',
    'pnw': 'pnw_oriented',
    'pnw_state': 'pnw_oriented',
    'remote': 'pnw_oriented',
    'home_automation': 'builder',
    'modern_lang': 'builder',
    'unix': 'builder',
    'containers': 'builder',
}

# language to skill mapping
LANGUAGE_TO_SKILL = {
    'python': 'backend',
    'go': 'backend',
    'rust': 'backend',
    'java': 'backend',
    'ruby': 'backend',
    'php': 'backend',
    'javascript': 'frontend',
    'typescript': 'frontend',
    'html': 'frontend',
    'css': 'frontend',
    'vue': 'frontend',
    'shell': 'devops',
    'dockerfile': 'devops',
    'nix': 'devops',
    'hcl': 'devops',
    'c': 'hardware',
    'c++': 'hardware',
    'arduino': 'hardware',
    'verilog': 'hardware',
}


def generate_fingerprint(human_data):
    """
    generate a values fingerprint for a human

    input: human dict from database (has signals, languages, etc)
    output: fingerprint dict with values_vector, skills, interests
    """
    # parse stored json fields
    signals = human_data.get('signals', [])
    if isinstance(signals, str):
        signals = json.loads(signals)

    extra = human_data.get('extra', {})
    if isinstance(extra, str):
        extra = json.loads(extra)

    languages = extra.get('languages', {})
    topics = extra.get('topics', [])

    # build values vector
    values_vector = defaultdict(float)

    # from signals
    for signal in signals:
        dimension = SIGNAL_TO_DIMENSION.get(signal)
        if dimension:
            values_vector[dimension] += 1.0

    # normalize values vector (0-1 scale)
    max_val = max(values_vector.values()) if values_vector else 1
    values_vector = {k: min(v / max_val, 1.0) for k, v in values_vector.items()}

    # fill in missing dimensions with 0
    for dim in VALUES_DIMENSIONS:
        if dim not in values_vector:
            values_vector[dim] = 0.0

    # determine skills from languages
    skills = defaultdict(float)
    total_repos = sum(languages.values()) if languages else 1

    for lang, count in languages.items():
        skill = LANGUAGE_TO_SKILL.get(lang.lower())
        if skill:
            skills[skill] += count / total_repos

    # normalize skills
    if skills:
        max_skill = max(skills.values())
        skills = {k: min(v / max_skill, 1.0) for k, v in skills.items()}

    # interests from topics and signals
    interests = list(set(topics + signals))

    # location preference
    location_pref = None
    if 'pnw' in signals or 'pnw_state' in signals:
        location_pref = 'pnw'
    elif 'remote' in signals:
        location_pref = 'remote'
    elif human_data.get('location'):
        loc = human_data['location'].lower()
        if any(x in loc for x in ['seattle', 'portland', 'washington', 'oregon', 'pnw', 'cascadia']):
            location_pref = 'pnw'

    # availability (based on hireable flag if present)
    availability = None
    if extra.get('hireable'):
        availability = 'open'

    return {
        'human_id': human_data.get('id'),
        'values_vector': dict(values_vector),
        'skills': dict(skills),
        'interests': interests,
        'location_pref': location_pref,
        'availability': availability,
    }


def fingerprint_similarity(fp_a, fp_b):
    """
    calculate similarity between two fingerprints
    returns 0-1 score
    """
    # values similarity (cosine-ish)
    va = fp_a.get('values_vector', {})
    vb = fp_b.get('values_vector', {})

    all_dims = set(va.keys()) | set(vb.keys())
    if not all_dims:
        return 0.0

    dot_product = sum(va.get(d, 0) * vb.get(d, 0) for d in all_dims)
    mag_a = sum(v**2 for v in va.values()) ** 0.5
    mag_b = sum(v**2 for v in vb.values()) ** 0.5

    if mag_a == 0 or mag_b == 0:
        values_sim = 0.0
    else:
        values_sim = dot_product / (mag_a * mag_b)

    # interest overlap (jaccard)
    ia = set(fp_a.get('interests', []))
    ib = set(fp_b.get('interests', []))

    if ia or ib:
        interest_sim = len(ia & ib) / len(ia | ib)
    else:
        interest_sim = 0.0

    # location compatibility
    loc_a = fp_a.get('location_pref')
    loc_b = fp_b.get('location_pref')

    loc_sim = 0.0
    if loc_a == loc_b and loc_a is not None:
        loc_sim = 1.0
    elif loc_a == 'remote' or loc_b == 'remote':
        loc_sim = 0.5
|
||||
elif loc_a == 'pnw' or loc_b == 'pnw':
|
||||
loc_sim = 0.3
|
||||
|
||||
# weighted combination
|
||||
similarity = (values_sim * 0.5) + (interest_sim * 0.3) + (loc_sim * 0.2)
|
||||
|
||||
return similarity
|
||||
|
|
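The weighted combination in `fingerprint_similarity` (values 0.5, interests 0.3, location 0.2) can be sanity-checked with a standalone sketch of the same arithmetic; the toy fingerprints here are illustrative, not records from the connectd database.

```python
# standalone sketch of the fingerprint_similarity weighting
# (toy fingerprints; dimension and interest names are illustrative)

def cosine(va, vb):
    dims = set(va) | set(vb)
    dot = sum(va.get(d, 0) * vb.get(d, 0) for d in dims)
    mag_a = sum(v ** 2 for v in va.values()) ** 0.5
    mag_b = sum(v ** 2 for v in vb.values()) ** 0.5
    return dot / (mag_a * mag_b) if mag_a and mag_b else 0.0

def similarity(fp_a, fp_b):
    values_sim = cosine(fp_a['values_vector'], fp_b['values_vector'])
    ia, ib = set(fp_a['interests']), set(fp_b['interests'])
    interest_sim = len(ia & ib) / len(ia | ib) if (ia or ib) else 0.0
    loc_sim = 1.0 if fp_a['location_pref'] == fp_b['location_pref'] else 0.0
    return values_sim * 0.5 + interest_sim * 0.3 + loc_sim * 0.2

a = {'values_vector': {'privacy': 1.0}, 'interests': ['foss', 'selfhosted'], 'location_pref': 'pnw'}
b = {'values_vector': {'privacy': 1.0}, 'interests': ['foss'], 'location_pref': 'pnw'}
# identical values vectors (cosine 1.0), jaccard 1/2, same location:
# 1.0*0.5 + 0.5*0.3 + 1.0*0.2 = 0.85
print(similarity(a, b))
```

Only the two-branch location term is sketched here; the real function also scores `remote`/`pnw` partial matches at 0.5 and 0.3.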
@@ -1,199 +0,0 @@
"""
matchd/lost.py - lost builder matching

lost builders don't get matched to each other (both need energy).
they get matched to ACTIVE builders who can inspire them.

the goal: show them someone like them who made it.
"""

import json
from .overlap import find_overlap, is_same_person


def find_inspiring_builder(lost_user, active_builders, db=None):
    """
    find an active builder who could inspire a lost builder.

    criteria:
    - shared interests (they need to relate to this person)
    - active builder has shipped real work (proof it's possible)
    - similar background signals if possible
    - NOT the same person across platforms
    """
    if not active_builders:
        return None, "no active builders available"

    # parse lost user data
    lost_signals = lost_user.get('signals', [])
    if isinstance(lost_signals, str):
        lost_signals = json.loads(lost_signals) if lost_signals else []

    lost_extra = lost_user.get('extra', {})
    if isinstance(lost_extra, str):
        lost_extra = json.loads(lost_extra) if lost_extra else {}

    # lost user interests
    lost_interests = set()
    lost_interests.update(lost_signals)
    lost_interests.update(lost_extra.get('topics', []))
    lost_interests.update(lost_extra.get('aligned_topics', []))

    # also include subreddits if from reddit (shows interests)
    subreddits = lost_user.get('subreddits', [])
    if isinstance(subreddits, str):
        subreddits = json.loads(subreddits) if subreddits else []
    lost_interests.update(subreddits)

    # score each active builder
    candidates = []

    for builder in active_builders:
        # skip if same person (cross-platform)
        if is_same_person(lost_user, builder):
            continue

        # get builder signals
        builder_signals = builder.get('signals', [])
        if isinstance(builder_signals, str):
            builder_signals = json.loads(builder_signals) if builder_signals else []

        builder_extra = builder.get('extra', {})
        if isinstance(builder_extra, str):
            builder_extra = json.loads(builder_extra) if builder_extra else {}

        # builder interests
        builder_interests = set()
        builder_interests.update(builder_signals)
        builder_interests.update(builder_extra.get('topics', []))
        builder_interests.update(builder_extra.get('aligned_topics', []))

        # calculate match score
        shared_interests = lost_interests & builder_interests
        match_score = len(shared_interests) * 10

        # bonus for high-value shared signals
        high_value_signals = ['privacy', 'selfhosted', 'home_automation', 'foss',
                              'solarpunk', 'cooperative', 'decentralized', 'queer']
        for signal in shared_interests:
            if signal in high_value_signals:
                match_score += 15

        # bonus if builder has shipped real work (proof it's possible)
        repos = builder_extra.get('top_repos', [])
        if len(repos) >= 5:
            match_score += 20  # they've built things
        elif len(repos) >= 2:
            match_score += 10

        # bonus for high stars (visible success)
        total_stars = sum(r.get('stars', 0) for r in repos) if repos else 0
        if total_stars >= 100:
            match_score += 15
        elif total_stars >= 20:
            match_score += 5

        # bonus for similar location (relatable)
        lost_loc = (lost_user.get('location') or '').lower()
        builder_loc = (builder.get('location') or '').lower()
        if lost_loc and builder_loc:
            pnw_keywords = ['seattle', 'portland', 'washington', 'oregon', 'pnw']
            if any(k in lost_loc for k in pnw_keywords) and any(k in builder_loc for k in pnw_keywords):
                match_score += 10

        # minimum threshold - need SOMETHING in common
        if match_score < 10:
            continue

        candidates.append({
            'builder': builder,
            'match_score': match_score,
            'shared_interests': list(shared_interests)[:5],
            'repos_count': len(repos),
            'total_stars': total_stars,
        })

    if not candidates:
        return None, "no matching active builders found"

    # sort by match score, return best
    candidates.sort(key=lambda x: x['match_score'], reverse=True)
    best = candidates[0]

    return best, None


def find_matches_for_lost_builders(db, min_lost_score=40, min_values_score=20, limit=10):
    """
    find inspiring builder matches for all lost builders ready for outreach.

    returns list of (lost_user, inspiring_builder, match_data)
    """
    # get lost builders ready for outreach
    lost_builders = db.get_lost_builders_for_outreach(
        min_lost_score=min_lost_score,
        min_values_score=min_values_score,
        limit=limit
    )

    if not lost_builders:
        return [], "no lost builders ready for outreach"

    # get active builders who can inspire
    active_builders = db.get_active_builders(min_score=50, limit=200)

    if not active_builders:
        return [], "no active builders available"

    matches = []

    for lost_user in lost_builders:
        best_match, error = find_inspiring_builder(lost_user, active_builders, db)

        if best_match:
            matches.append({
                'lost_user': lost_user,
                'inspiring_builder': best_match['builder'],
                'match_score': best_match['match_score'],
                'shared_interests': best_match['shared_interests'],
                'builder_repos': best_match['repos_count'],
                'builder_stars': best_match['total_stars'],
            })

    return matches, None


def get_lost_match_summary(match_data):
    """
    get a human-readable summary of a lost builder match.
    """
    lost = match_data['lost_user']
    builder = match_data['inspiring_builder']

    lost_name = lost.get('name') or lost.get('username', 'someone')
    builder_name = builder.get('name') or builder.get('username', 'a builder')

    shared = match_data.get('shared_interests', [])

    summary = f"""
lost builder: {lost_name} ({lost.get('platform')})
  lost score: {lost.get('lost_potential_score', 0)}
  values score: {lost.get('score', 0)}
  url: {lost.get('url')}

inspiring builder: {builder_name} ({builder.get('platform')})
  score: {builder.get('score', 0)}
  repos: {match_data.get('builder_repos', 0)}
  stars: {match_data.get('builder_stars', 0)}
  url: {builder.get('url')}

match score: {match_data.get('match_score', 0)}
shared interests: {', '.join(shared) if shared else 'values alignment'}

this lost builder needs to see that someone like them made it.
"""
    return summary.strip()
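The bonuses in `find_inspiring_builder` stack additively, which is easy to verify with a standalone sketch of the same match-score arithmetic; the interest names and repo/star counts below are toy inputs, not real builders.

```python
# standalone sketch of the match-score arithmetic in find_inspiring_builder
# (toy data; mirrors the bonus tiers, not the db or dedup plumbing)

HIGH_VALUE = {'privacy', 'selfhosted', 'home_automation', 'foss',
              'solarpunk', 'cooperative', 'decentralized', 'queer'}

def match_score(shared_interests, repos_count, total_stars):
    score = len(shared_interests) * 10                       # 10 per shared interest
    score += sum(15 for s in shared_interests if s in HIGH_VALUE)  # high-value bonus
    if repos_count >= 5:                                     # shipped real work
        score += 20
    elif repos_count >= 2:
        score += 10
    if total_stars >= 100:                                   # visible success
        score += 15
    elif total_stars >= 20:
        score += 5
    return score

# two shared interests, both high-value, builder with 6 repos and 150 stars:
# 2*10 + 2*15 + 20 + 15 = 85
print(match_score({'privacy', 'selfhosted'}, 6, 150))
```

Scores below 10 are discarded by the real function's minimum threshold, so a pair with nothing in common never becomes a candidate.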
@@ -1,150 +0,0 @@
"""
matchd/overlap.py - find pairs with alignment
"""

import json
from .fingerprint import fingerprint_similarity


def find_overlap(human_a, human_b, fp_a=None, fp_b=None):
    """
    analyze overlap between two humans
    returns overlap details: score, shared values, complementary skills
    """
    # parse stored json if needed
    signals_a = human_a.get('signals', [])
    if isinstance(signals_a, str):
        signals_a = json.loads(signals_a)

    signals_b = human_b.get('signals', [])
    if isinstance(signals_b, str):
        signals_b = json.loads(signals_b)

    extra_a = human_a.get('extra', {})
    if isinstance(extra_a, str):
        extra_a = json.loads(extra_a)

    extra_b = human_b.get('extra', {})
    if isinstance(extra_b, str):
        extra_b = json.loads(extra_b)

    # shared signals
    shared_signals = list(set(signals_a) & set(signals_b))

    # shared topics
    topics_a = set(extra_a.get('topics', []))
    topics_b = set(extra_b.get('topics', []))
    shared_topics = list(topics_a & topics_b)

    # complementary skills (what one has that the other doesn't)
    langs_a = set(extra_a.get('languages', {}).keys())
    langs_b = set(extra_b.get('languages', {}).keys())
    complementary_langs = list((langs_a - langs_b) | (langs_b - langs_a))

    # geographic compatibility
    loc_a = (human_a.get('location') or '').lower()
    loc_b = (human_b.get('location') or '').lower()

    pnw_keywords = ['seattle', 'portland', 'washington', 'oregon', 'pnw', 'cascadia', 'pacific northwest']
    remote_keywords = ['remote', 'anywhere', 'distributed']

    a_pnw = any(k in loc_a for k in pnw_keywords) or 'pnw' in signals_a
    b_pnw = any(k in loc_b for k in pnw_keywords) or 'pnw' in signals_b
    a_remote = any(k in loc_a for k in remote_keywords) or 'remote' in signals_a
    b_remote = any(k in loc_b for k in remote_keywords) or 'remote' in signals_b

    geographic_match = False
    geo_reason = None

    if a_pnw and b_pnw:
        geographic_match = True
        geo_reason = 'both in pnw'
    elif (a_pnw or b_pnw) and (a_remote or b_remote):
        geographic_match = True
        geo_reason = 'pnw + remote compatible'
    elif a_remote and b_remote:
        geographic_match = True
        geo_reason = 'both remote-friendly'

    # calculate overlap score
    base_score = 0

    # shared values (most important)
    base_score += len(shared_signals) * 10

    # shared interests
    base_score += len(shared_topics) * 5

    # complementary skills bonus (they can help each other)
    if complementary_langs:
        base_score += min(len(complementary_langs), 5) * 3

    # geographic bonus
    if geographic_match:
        base_score += 20

    # fingerprint similarity if available
    fp_score = 0
    if fp_a and fp_b:
        fp_score = fingerprint_similarity(fp_a, fp_b) * 50

    total_score = base_score + fp_score

    # build reasons
    overlap_reasons = []
    if shared_signals:
        overlap_reasons.append(f"shared values: {', '.join(shared_signals[:5])}")
    if shared_topics:
        overlap_reasons.append(f"shared interests: {', '.join(shared_topics[:5])}")
    if geo_reason:
        overlap_reasons.append(geo_reason)
    if complementary_langs:
        overlap_reasons.append(f"complementary skills: {', '.join(complementary_langs[:5])}")

    return {
        'overlap_score': total_score,
        'shared_signals': shared_signals,
        'shared_topics': shared_topics,
        'complementary_skills': complementary_langs,
        'geographic_match': geographic_match,
        'geo_reason': geo_reason,
        'overlap_reasons': overlap_reasons,
        'fingerprint_similarity': fp_score / 50 if fp_a and fp_b else None,
    }


def is_same_person(human_a, human_b):
    """
    check if two records might be the same person (cross-platform)
    """
    # same platform = definitely different records
    if human_a['platform'] == human_b['platform']:
        return False

    # check username similarity (guard against two empty usernames matching)
    user_a = human_a.get('username', '').lower().split('@')[0]
    user_b = human_b.get('username', '').lower().split('@')[0]

    if user_a and user_a == user_b:
        return True

    # check if github username matches
    contact_a = human_a.get('contact', {})
    contact_b = human_b.get('contact', {})

    if isinstance(contact_a, str):
        contact_a = json.loads(contact_a)
    if isinstance(contact_b, str):
        contact_b = json.loads(contact_b)

    # github cross-reference
    if contact_a.get('github') and contact_a.get('github') == contact_b.get('github'):
        return True
    if (contact_a.get('github') and contact_a.get('github') == user_b) or \
       (contact_b.get('github') and contact_b.get('github') == user_a):
        return True

    # email cross-reference
    if contact_a.get('email') and contact_a.get('email') == contact_b.get('email'):
        return True

    return False
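The base-score accumulation in `find_overlap` is pure arithmetic over counts, so it can be checked in isolation; the counts below are toy inputs, and the real total also adds fingerprint similarity scaled to 50 when fingerprints are available.

```python
# standalone sketch of the base-score arithmetic in find_overlap
# (toy counts; fingerprint similarity * 50 is added on top in the real function)

def base_score(n_shared_signals, n_shared_topics, n_complementary_langs, geographic_match):
    score = n_shared_signals * 10         # shared values (most important)
    score += n_shared_topics * 5          # shared interests
    if n_complementary_langs:             # capped complementary-skills bonus
        score += min(n_complementary_langs, 5) * 3
    if geographic_match:                  # flat geographic bonus
        score += 20
    return score

# 3 shared signals, 2 shared topics, 4 complementary languages, both in pnw:
# 30 + 10 + 12 + 20 = 72
print(base_score(3, 2, 4, True))
```

Note the cap: more than five complementary languages never adds beyond 15 points, which keeps polyglot profiles from dominating the score.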
@@ -1,137 +0,0 @@
"""
matchd/rank.py - score and rank match quality
"""

from itertools import combinations
from .fingerprint import generate_fingerprint
from .overlap import find_overlap, is_same_person
from scoutd.deep import check_already_connected


def rank_matches(matches):
    """
    rank a list of matches by quality
    returns sorted list with quality scores
    """
    ranked = []

    for match in matches:
        # base score from overlap
        score = match.get('overlap_score', 0)

        # bonus for geographic match
        if match.get('geographic_match'):
            score *= 1.2

        # bonus for high fingerprint similarity
        fp_sim = match.get('fingerprint_similarity')
        if fp_sim and fp_sim > 0.7:
            score *= 1.3

        # bonus for complementary skills
        comp_skills = match.get('complementary_skills', [])
        if len(comp_skills) >= 3:
            score *= 1.1

        match['quality_score'] = score
        ranked.append(match)

    # sort by quality score
    ranked.sort(key=lambda x: x['quality_score'], reverse=True)

    return ranked


def find_all_matches(db, min_score=30, min_overlap=20):
    """
    find all potential matches from database
    returns list of match dicts
    """
    print("matchd: finding all potential matches...")

    # get all humans above threshold
    humans = db.get_all_humans(min_score=min_score)
    print(f"  {len(humans)} humans to match")

    # generate fingerprints
    fingerprints = {}
    for human in humans:
        fp = generate_fingerprint(human)
        fingerprints[human['id']] = fp
        db.save_fingerprint(human['id'], fp)

    print(f"  generated {len(fingerprints)} fingerprints")

    # find all pairs
    matches = []
    checked = 0
    skipped_same = 0
    skipped_connected = 0

    for human_a, human_b in combinations(humans, 2):
        checked += 1

        # skip if likely same person
        if is_same_person(human_a, human_b):
            skipped_same += 1
            continue

        # skip if already connected (same org, company, co-contributors)
        connected, reason = check_already_connected(human_a, human_b)
        if connected:
            skipped_connected += 1
            continue

        # calculate overlap
        fp_a = fingerprints.get(human_a['id'])
        fp_b = fingerprints.get(human_b['id'])

        overlap = find_overlap(human_a, human_b, fp_a, fp_b)

        if overlap['overlap_score'] >= min_overlap:
            match = {
                'human_a': human_a,
                'human_b': human_b,
                **overlap
            }
            matches.append(match)

            # save to db
            db.save_match(human_a['id'], human_b['id'], overlap)

        if checked % 1000 == 0:
            print(f"  checked {checked} pairs, {len(matches)} matches so far...")

    print(f"  checked {checked} pairs")
    print(f"  skipped {skipped_same} (same person), {skipped_connected} (already connected)")
    print(f"  found {len(matches)} potential matches")

    # rank them
    ranked = rank_matches(matches)

    return ranked


def get_top_matches(db, limit=50):
    """
    get top matches from database
    """
    match_rows = db.get_matches(limit=limit)

    matches = []
    for row in match_rows:
        human_a = db.get_human_by_id(row['human_a_id'])
        human_b = db.get_human_by_id(row['human_b_id'])

        if human_a and human_b:
            matches.append({
                'id': row['id'],
                'human_a': human_a,
                'human_b': human_b,
                'overlap_score': row['overlap_score'],
                'overlap_reasons': row['overlap_reasons'],
                'geographic_match': row['geographic_match'],
                'status': row['status'],
            })

    return matches
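Unlike the additive overlap bonuses, `rank_matches` applies multiplicative quality boosts, so the factors compound. A standalone sketch of just the multiplier logic (toy match dict, no db plumbing) makes the compounding explicit:

```python
# standalone sketch of the quality multipliers in rank_matches
# (toy match dict; skill names are illustrative)

def quality(match):
    score = match.get('overlap_score', 0)
    if match.get('geographic_match'):                 # geography: +20%
        score *= 1.2
    fp_sim = match.get('fingerprint_similarity')
    if fp_sim and fp_sim > 0.7:                       # strong fingerprint: +30%
        score *= 1.3
    if len(match.get('complementary_skills', [])) >= 3:  # 3+ complementary skills: +10%
        score *= 1.1
    return score

m = {'overlap_score': 50, 'geographic_match': True,
     'fingerprint_similarity': 0.8,
     'complementary_skills': ['rust', 'go', 'sql']}
# 50 * 1.2 * 1.3 * 1.1 = 85.8
print(round(quality(m), 1))
```

A match hitting all three boosts gains about 72% over its raw overlap score, which is why geographic and fingerprint signals dominate the final ranking.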
@@ -1,3 +0,0 @@
name: connectd add-ons
url: https://github.com/sudoxnym/connectd
maintainer: sudoxnym
@@ -1,2 +0,0 @@
requests>=2.28.0
beautifulsoup4>=4.12.0
@@ -1,45 +0,0 @@
#!/usr/bin/with-contenv bashio
# shellcheck shell=bash

# read options from add-on config
export HOST_USER=$(bashio::config 'host_user')
export HOST_NAME=$(bashio::config 'host_name')
export HOST_EMAIL=$(bashio::config 'host_email')
export HOST_MASTODON=$(bashio::config 'host_mastodon')
export HOST_REDDIT=$(bashio::config 'host_reddit')
export HOST_LEMMY=$(bashio::config 'host_lemmy')
export HOST_LOBSTERS=$(bashio::config 'host_lobsters')
export HOST_MATRIX=$(bashio::config 'host_matrix')
export HOST_DISCORD=$(bashio::config 'host_discord')
export HOST_BLUESKY=$(bashio::config 'host_bluesky')
export HOST_LOCATION=$(bashio::config 'host_location')
export HOST_INTERESTS=$(bashio::config 'host_interests')
export HOST_LOOKING_FOR=$(bashio::config 'host_looking_for')

export GITHUB_TOKEN=$(bashio::config 'github_token')
export GROQ_API_KEY=$(bashio::config 'groq_api_key')

export MASTODON_TOKEN=$(bashio::config 'mastodon_token')
export MASTODON_INSTANCE=$(bashio::config 'mastodon_instance')

export DISCORD_BOT_TOKEN=$(bashio::config 'discord_bot_token')
export DISCORD_TARGET_SERVERS=$(bashio::config 'discord_target_servers')

export LEMMY_INSTANCE=$(bashio::config 'lemmy_instance')
export LEMMY_USERNAME=$(bashio::config 'lemmy_username')
export LEMMY_PASSWORD=$(bashio::config 'lemmy_password')

export SMTP_HOST=$(bashio::config 'smtp_host')
export SMTP_PORT=$(bashio::config 'smtp_port')
export SMTP_USER=$(bashio::config 'smtp_user')
export SMTP_PASS=$(bashio::config 'smtp_pass')

# set data paths
export DB_PATH=/data/db/connectd.db
export CACHE_DIR=/data/cache

bashio::log.info "starting connectd daemon..."
bashio::log.info "HOST_USER: ${HOST_USER}"

cd /app
exec python3 daemon.py
@@ -1,29 +0,0 @@
"""
scoutd - discovery module
finds humans across platforms
"""

from .github import scrape_github, get_github_user
from .reddit import scrape_reddit
from .mastodon import scrape_mastodon
from .lobsters import scrape_lobsters
from .matrix import scrape_matrix
from .twitter import scrape_twitter
from .bluesky import scrape_bluesky
from .lemmy import scrape_lemmy
from .discord import scrape_discord, send_discord_dm
from .deep import (
    deep_scrape_github_user, check_already_connected, save_deep_profile,
    determine_contact_method, get_cached_orgs, cache_orgs,
    get_emails_from_commit_history, scrape_website_for_emails,
)

__all__ = [
    'scrape_github', 'scrape_reddit', 'scrape_mastodon', 'scrape_lobsters',
    'scrape_matrix', 'scrape_twitter', 'scrape_bluesky', 'scrape_lemmy',
    'scrape_discord', 'send_discord_dm',
    'get_github_user', 'deep_scrape_github_user',
    'check_already_connected', 'save_deep_profile', 'determine_contact_method',
    'get_cached_orgs', 'cache_orgs', 'get_emails_from_commit_history',
    'scrape_website_for_emails',
]
@@ -1,216 +0,0 @@
"""
scoutd/bluesky.py - bluesky/atproto discovery

bluesky has an open API via AT Protocol - no auth needed for public data
many twitter refugees landed here, good source for aligned builders
"""

import requests
import json
import time
from datetime import datetime
from pathlib import Path

from .signals import analyze_text

HEADERS = {'User-Agent': 'connectd/1.0', 'Accept': 'application/json'}
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'bluesky'

# public bluesky API
BSKY_API = 'https://public.api.bsky.app'

# hashtags to search
ALIGNED_HASHTAGS = [
    'selfhosted', 'homelab', 'homeassistant', 'foss', 'opensource',
    'privacy', 'solarpunk', 'cooperative', 'mutualaid', 'localfirst',
    'indieweb', 'smallweb', 'permacomputing', 'techworkers', 'coops',
]


def _api_get(endpoint, params=None):
    """rate-limited API request with caching"""
    url = f"{BSKY_API}{endpoint}"
    cache_key = f"{url}_{json.dumps(params or {}, sort_keys=True)}"
    cache_file = CACHE_DIR / f"{hash(cache_key) & 0xffffffff}.json"
    CACHE_DIR.mkdir(parents=True, exist_ok=True)

    if cache_file.exists():
        try:
            data = json.loads(cache_file.read_text())
            if time.time() - data.get('_cached_at', 0) < 3600:
                return data.get('_data')
        except (json.JSONDecodeError, OSError):
            pass

    time.sleep(0.5)  # rate limit

    try:
        resp = requests.get(url, headers=HEADERS, params=params, timeout=30)
        resp.raise_for_status()
        result = resp.json()
        cache_file.write_text(json.dumps({'_cached_at': time.time(), '_data': result}))
        return result
    except requests.exceptions.RequestException as e:
        print(f"  bluesky api error: {e}")
        return None


def search_posts(query, limit=50):
    """search for posts containing query"""
    result = _api_get('/xrpc/app.bsky.feed.searchPosts', {
        'q': query,
        'limit': min(limit, 100),
    })

    if not result:
        return []

    return result.get('posts', [])


def get_profile(handle):
    """get user profile by handle (e.g., user.bsky.social)"""
    return _api_get('/xrpc/app.bsky.actor.getProfile', {'actor': handle})


def get_author_feed(handle, limit=30):
    """get user's recent posts"""
    result = _api_get('/xrpc/app.bsky.feed.getAuthorFeed', {
        'actor': handle,
        'limit': limit,
    })

    if not result:
        return []

    return result.get('feed', [])


def analyze_bluesky_user(handle):
    """analyze a bluesky user for alignment"""
    profile = get_profile(handle)
    if not profile:
        return None

    # collect text
    text_parts = []

    # bio/description
    description = profile.get('description', '')
    if description:
        text_parts.append(description)

    display_name = profile.get('displayName', '')
    if display_name:
        text_parts.append(display_name)

    # recent posts
    feed = get_author_feed(handle, limit=20)
    for item in feed:
        post = item.get('post', {})
        record = post.get('record', {})
        text = record.get('text', '')
        if text:
            text_parts.append(text)

    full_text = ' '.join(text_parts)
    text_score, positive_signals, negative_signals = analyze_text(full_text)

    # bluesky bonus (decentralized, values-aligned platform choice)
    platform_bonus = 10
    total_score = text_score + platform_bonus

    # activity bonus
    followers = profile.get('followersCount', 0)
    posts_count = profile.get('postsCount', 0)

    if posts_count >= 100:
        total_score += 5
    if followers >= 100:
        total_score += 5

    # confidence
    confidence = 0.35  # base for bluesky (better signal than twitter)
    if len(text_parts) > 5:
        confidence += 0.2
    if len(positive_signals) >= 3:
        confidence += 0.2
    if posts_count >= 50:
        confidence += 0.1
    confidence = min(confidence, 0.85)

    reasons = ['on bluesky (atproto)']
    if positive_signals:
        reasons.append(f"signals: {', '.join(positive_signals[:5])}")
    if negative_signals:
        reasons.append(f"WARNING: {', '.join(negative_signals)}")

    return {
        'platform': 'bluesky',
        'username': handle,
        'url': f"https://bsky.app/profile/{handle}",
        'name': display_name or handle,
        'bio': description,
        'score': total_score,
        'confidence': confidence,
        'signals': positive_signals,
        'negative_signals': negative_signals,
        'followers': followers,
        'posts_count': posts_count,
        'reasons': reasons,
        'contact': {
            'bluesky': handle,
        },
        'scraped_at': datetime.now().isoformat(),
    }


def scrape_bluesky(db, limit_per_hashtag=30):
    """full bluesky scrape"""
    print("scoutd/bluesky: starting scrape...")

    all_users = {}

    for hashtag in ALIGNED_HASHTAGS:
        print(f"  #{hashtag}...")

        # search for hashtag
        posts = search_posts(f"#{hashtag}", limit=limit_per_hashtag)

        for post in posts:
            author = post.get('author', {})
            handle = author.get('handle')

            if handle and handle not in all_users:
                all_users[handle] = {
                    'handle': handle,
                    'display_name': author.get('displayName'),
                    'hashtags': [hashtag],
                }
            elif handle:
                all_users[handle]['hashtags'].append(hashtag)

        print(f"    found {len(posts)} posts")

    # prioritize users in multiple hashtags
    multi_hashtag = {h: d for h, d in all_users.items() if len(d.get('hashtags', [])) >= 2}
    print(f"  {len(multi_hashtag)} users in 2+ aligned hashtags")

    # analyze
    results = []
    for handle in list(multi_hashtag.keys())[:100]:
        try:
            result = analyze_bluesky_user(handle)
            if result and result['score'] > 0:
                results.append(result)
                db.save_human(result)

                if result['score'] >= 30:
                    print(f"  ★ @{handle}: {result['score']} pts")
        except Exception as e:
            print(f"  error on {handle}: {e}")

    print(f"scoutd/bluesky: found {len(results)} aligned humans")
    return results
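The confidence heuristic in `analyze_bluesky_user` is a capped sum of fixed bumps, so its behavior is easy to pin down with a standalone sketch; the counts below are toy inputs, not real profile data.

```python
# standalone sketch of the confidence heuristic in analyze_bluesky_user
# (toy counts; mirrors the additive bumps and the 0.85 cap)

def confidence(n_text_parts, n_positive_signals, posts_count):
    c = 0.35                      # base for bluesky
    if n_text_parts > 5:          # enough text to judge
        c += 0.2
    if n_positive_signals >= 3:   # multiple aligned signals
        c += 0.2
    if posts_count >= 50:         # active account
        c += 0.1
    return min(c, 0.85)

# an active, signal-rich account hits the cap: 0.35 + 0.2 + 0.2 + 0.1 = 0.85
print(confidence(10, 4, 200))
```

The cap at 0.85 means confidence never saturates to 1.0 from profile text alone, leaving headroom for stronger evidence from other platforms.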
@ -1,323 +0,0 @@
|
|||
"""
|
||||
scoutd/discord.py - discord discovery
|
||||
|
||||
discord requires a bot token to read messages.
|
||||
target servers: programming help, career transition, indie hackers, etc.
|
||||
|
||||
SETUP:
|
||||
1. create discord app at discord.com/developers
|
||||
2. add bot, get token
|
||||
3. join target servers with bot
|
||||
4. set DISCORD_BOT_TOKEN env var
|
||||
"""
|
||||
|
||||
import requests
|
||||
import json
|
||||
import time
|
||||
import os
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
|
||||
from .signals import analyze_text
|
||||
from .lost import (
|
||||
analyze_social_for_lost_signals,
|
||||
classify_user,
|
||||
)
|
||||
|
||||
DISCORD_BOT_TOKEN = os.environ.get('DISCORD_BOT_TOKEN', '')
|
||||
DISCORD_API = 'https://discord.com/api/v10'
|
||||
|
||||
# default server IDs - values-aligned communities
|
||||
# bot must be invited to these servers to scout them
|
||||
# invite links for reference (use numeric IDs below):
|
||||
# - self-hosted: discord.gg/self-hosted
|
||||
# - foss-dev: discord.gg/foss-developers-group
|
||||
# - grapheneos: discord.gg/grapheneos
|
||||
# - queer-coded: discord.me/queer-coded
|
||||
# - homelab: discord.gg/homelab
|
||||
# - esphome: discord.gg/n9sdw7pnsn
|
||||
# - home-assistant: discord.gg/home-assistant
|
||||
# - linuxserver: discord.gg/linuxserver
|
||||
# - proxmox-scripts: discord.gg/jsYVk5JBxq
|
||||
DEFAULT_SERVERS = [
|
||||
# self-hosted / foss / privacy
|
||||
'693469700109369394', # self-hosted (selfhosted.show)
|
||||
'920089648842293248', # foss developers group
|
||||
'1176414688112820234', # grapheneos
|
||||
|
||||
# queer tech
|
||||
'925804557001437184', # queer coded
|
||||
|
||||
# home automation / homelab
|
||||
# note: these are large servers, bot needs to be invited
|
||||
# '330944238910963714', # home assistant (150k+ members)
|
||||
# '429907082951524364', # esphome (35k members)
|
||||
# '478094546522079232', # homelab (35k members)
|
||||
# '354974912613449730', # linuxserver.io (41k members)
|
||||
]
|
||||
|
||||
# merge env var servers with defaults
|
||||
_env_servers = os.environ.get('DISCORD_TARGET_SERVERS', '').split(',')
|
||||
_env_servers = [s.strip() for s in _env_servers if s.strip()]
|
||||
TARGET_SERVERS = list(set(DEFAULT_SERVERS + _env_servers))
|
||||
|
||||
# channels to focus on (keywords in channel name)
|
||||
TARGET_CHANNEL_KEYWORDS = [
|
||||
'help', 'career', 'jobs', 'learning', 'beginner',
|
||||
'general', 'introductions', 'showcase', 'projects',
|
||||
]
|
||||
|
||||
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'discord'
|
||||
CACHE_DIR.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
|
||||
def get_headers():
    """get discord api headers"""
    if not DISCORD_BOT_TOKEN:
        return None
    return {
        'Authorization': f'Bot {DISCORD_BOT_TOKEN}',
        'Content-Type': 'application/json',
    }


def get_guild_channels(guild_id):
    """get channels in a guild"""
    headers = get_headers()
    if not headers:
        return []

    try:
        resp = requests.get(
            f'{DISCORD_API}/guilds/{guild_id}/channels',
            headers=headers,
            timeout=30
        )
        if resp.status_code == 200:
            return resp.json()
        return []
    except Exception:
        return []


def get_channel_messages(channel_id, limit=100):
    """get recent messages from a channel"""
    headers = get_headers()
    if not headers:
        return []

    try:
        resp = requests.get(
            f'{DISCORD_API}/channels/{channel_id}/messages',
            headers=headers,
            params={'limit': limit},
            timeout=30
        )
        if resp.status_code == 200:
            return resp.json()
        return []
    except Exception:
        return []


def get_user_info(user_id):
    """get discord user info"""
    headers = get_headers()
    if not headers:
        return None

    try:
        resp = requests.get(
            f'{DISCORD_API}/users/{user_id}',
            headers=headers,
            timeout=30
        )
        if resp.status_code == 200:
            return resp.json()
        return None
    except Exception:
        return None


def analyze_discord_user(user_data, messages=None):
    """analyze a discord user for values alignment and lost signals"""
    username = user_data.get('username', '')
    display_name = user_data.get('global_name') or username
    user_id = user_data.get('id')

    # analyze messages
    all_signals = []
    all_text = []
    total_score = 0

    if messages:
        for msg in messages[:20]:
            content = msg.get('content', '')
            if not content or len(content) < 20:
                continue

            all_text.append(content)
            score, signals, _ = analyze_text(content)
            all_signals.extend(signals)
            total_score += score

    all_signals = list(set(all_signals))

    # lost builder detection
    profile_for_lost = {
        'bio': '',
        'message_count': len(messages) if messages else 0,
    }
    posts_for_lost = [{'text': t} for t in all_text]

    lost_signals, lost_weight = analyze_social_for_lost_signals(profile_for_lost, posts_for_lost)
    lost_potential_score = lost_weight
    user_type = classify_user(lost_potential_score, 50, total_score)

    return {
        'platform': 'discord',
        'username': username,
        'url': f"https://discord.com/users/{user_id}",
        'name': display_name,
        'bio': '',
        'location': None,
        'score': total_score,
        'confidence': min(0.8, 0.2 + len(all_signals) * 0.1),
        'signals': all_signals,
        'negative_signals': [],
        'reasons': [],
        'contact': {'discord': f"{username}#{user_data.get('discriminator', '0')}"},
        'extra': {
            'user_id': user_id,
            'message_count': len(messages) if messages else 0,
        },
        'lost_potential_score': lost_potential_score,
        'lost_signals': lost_signals,
        'user_type': user_type,
    }
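The confidence heuristic in `analyze_discord_user` starts from a 0.2 base, adds 0.1 per distinct signal, and caps at 0.8. A standalone sketch (the `confidence` helper name is mine, for illustration only):

```python
# standalone sketch of the discord confidence heuristic above
def confidence(n_signals):
    # 0.2 base, +0.1 per distinct signal, hard cap at 0.8
    return min(0.8, 0.2 + n_signals * 0.1)

print([round(confidence(n), 2) for n in (0, 3, 10)])  # -> [0.2, 0.5, 0.8]
```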
def scrape_discord(db, limit_per_channel=50):
    """scrape discord servers for aligned builders"""
    if not DISCORD_BOT_TOKEN:
        print("discord: DISCORD_BOT_TOKEN not set, skipping")
        return 0

    if not TARGET_SERVERS or TARGET_SERVERS == ['']:
        print("discord: DISCORD_TARGET_SERVERS not set, skipping")
        return 0

    print("scouting discord...")

    found = 0
    lost_found = 0
    seen_users = set()

    for guild_id in TARGET_SERVERS:
        if not guild_id:
            continue

        guild_id = guild_id.strip()
        channels = get_guild_channels(guild_id)

        if not channels:
            print(f"  guild {guild_id}: no access or no channels")
            continue

        # filter to relevant channels
        target_channels = []
        for ch in channels:
            if ch.get('type') != 0:  # text channels only
                continue
            name = ch.get('name', '').lower()
            if any(kw in name for kw in TARGET_CHANNEL_KEYWORDS):
                target_channels.append(ch)

        print(f"  guild {guild_id}: {len(target_channels)} relevant channels")

        for channel in target_channels[:5]:  # limit channels per server
            messages = get_channel_messages(channel['id'], limit=limit_per_channel)

            if not messages:
                continue

            # group messages by user
            user_messages = {}
            for msg in messages:
                author = msg.get('author', {})
                if author.get('bot'):
                    continue

                user_id = author.get('id')
                if not user_id or user_id in seen_users:
                    continue

                if user_id not in user_messages:
                    user_messages[user_id] = {'user': author, 'messages': []}
                user_messages[user_id]['messages'].append(msg)

            # analyze each user
            for user_id, data in user_messages.items():
                if user_id in seen_users:
                    continue
                seen_users.add(user_id)

                result = analyze_discord_user(data['user'], data['messages'])
                if not result:
                    continue

                if result['score'] >= 20 or result.get('lost_potential_score', 0) >= 30:
                    db.save_human(result)
                    found += 1

                    if result.get('user_type') in ['lost', 'both']:
                        lost_found += 1

            time.sleep(1)  # rate limit between channels

        time.sleep(2)  # between guilds

    print(f"discord: found {found} humans ({lost_found} lost builders)")
    return found


def send_discord_dm(user_id, message, dry_run=False):
    """send a DM to a discord user"""
    if not DISCORD_BOT_TOKEN:
        return False, "DISCORD_BOT_TOKEN not set"

    if dry_run:
        print(f"  [dry run] would DM discord user {user_id}")
        return True, "dry run"

    headers = get_headers()

    try:
        # create DM channel
        dm_resp = requests.post(
            f'{DISCORD_API}/users/@me/channels',
            headers=headers,
            json={'recipient_id': user_id},
            timeout=30
        )

        if dm_resp.status_code not in [200, 201]:
            return False, f"couldn't create DM channel: {dm_resp.status_code}"

        channel_id = dm_resp.json().get('id')

        # send message
        msg_resp = requests.post(
            f'{DISCORD_API}/channels/{channel_id}/messages',
            headers=headers,
            json={'content': message},
            timeout=30
        )

        if msg_resp.status_code in [200, 201]:
            return True, f"sent to {user_id}"
        else:
            return False, f"send failed: {msg_resp.status_code}"

    except Exception as e:
        return False, str(e)

@@ -1,330 +0,0 @@
"""
|
||||
scoutd/github.py - github discovery
|
||||
scrapes repos, bios, commit patterns to find aligned builders
|
||||
also detects lost builders - people with potential who haven't started yet
|
||||
"""
|
||||
|
||||
import requests
|
||||
import json
|
||||
import time
|
||||
import os
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
from collections import defaultdict
|
||||
|
||||
from .signals import analyze_text, TARGET_TOPICS, ECOSYSTEM_REPOS
|
||||
from .lost import (
|
||||
analyze_github_for_lost_signals,
|
||||
analyze_text_for_lost_signals,
|
||||
classify_user,
|
||||
get_signal_descriptions,
|
||||
)
|
||||
from .handles import discover_all_handles
|
||||
|
||||
# rate limit: 60/hr unauthenticated, 5000/hr with token
|
||||
GITHUB_TOKEN = os.environ.get('GITHUB_TOKEN', '')
|
||||
HEADERS = {'Accept': 'application/vnd.github.v3+json'}
|
||||
if GITHUB_TOKEN:
|
||||
HEADERS['Authorization'] = f'token {GITHUB_TOKEN}'
|
||||
|
||||
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'github'
|
||||
|
||||
|
||||
def _api_get(url, params=None):
    """rate-limited api request with caching"""
    import hashlib  # stable digest: builtin hash() is randomized per process, so it can't key a persistent cache

    cache_key = f"{url}_{json.dumps(params or {}, sort_keys=True)}"
    cache_file = CACHE_DIR / f"{hashlib.sha1(cache_key.encode()).hexdigest()[:16]}.json"
    CACHE_DIR.mkdir(parents=True, exist_ok=True)

    # check cache (1 hour expiry)
    if cache_file.exists():
        try:
            data = json.loads(cache_file.read_text())
            if time.time() - data.get('_cached_at', 0) < 3600:
                return data.get('_data')
        except (json.JSONDecodeError, OSError):
            pass

    # rate limit
    time.sleep(0.5 if GITHUB_TOKEN else 2)

    try:
        resp = requests.get(url, headers=HEADERS, params=params, timeout=30)
        resp.raise_for_status()
        result = resp.json()

        # cache
        cache_file.write_text(json.dumps({'_cached_at': time.time(), '_data': result}))
        return result
    except requests.exceptions.RequestException as e:
        print(f"  github api error: {e}")
        return None


def search_repos_by_topic(topic, per_page=100):
    """search repos by topic tag"""
    url = 'https://api.github.com/search/repositories'
    params = {'q': f'topic:{topic}', 'sort': 'stars', 'order': 'desc', 'per_page': per_page}
    data = _api_get(url, params)
    return data.get('items', []) if data else []


def get_repo_contributors(repo_full_name, per_page=100):
    """get top contributors to a repo"""
    url = f'https://api.github.com/repos/{repo_full_name}/contributors'
    return _api_get(url, {'per_page': per_page}) or []


def get_github_user(login):
    """get full user profile"""
    url = f'https://api.github.com/users/{login}'
    return _api_get(url)


def get_user_repos(login, per_page=100):
    """get user's repos"""
    url = f'https://api.github.com/users/{login}/repos'
    return _api_get(url, {'per_page': per_page, 'sort': 'pushed'}) or []


def analyze_github_user(login):
    """
    analyze a github user for values alignment
    returns dict with score, confidence, signals, contact info
    """
    user = get_github_user(login)
    if not user:
        return None

    repos = get_user_repos(login)

    # collect text corpus
    text_parts = []
    if user.get('bio'):
        text_parts.append(user['bio'])
    if user.get('company'):
        text_parts.append(user['company'])
    if user.get('location'):
        text_parts.append(user['location'])

    # analyze repos
    all_topics = []
    languages = defaultdict(int)
    total_stars = 0

    for repo in repos:
        if repo.get('description'):
            text_parts.append(repo['description'])
        if repo.get('topics'):
            all_topics.extend(repo['topics'])
        if repo.get('language'):
            languages[repo['language']] += 1
        total_stars += repo.get('stargazers_count', 0)

    full_text = ' '.join(text_parts)

    # analyze signals
    text_score, positive_signals, negative_signals = analyze_text(full_text)

    # topic alignment
    aligned_topics = set(all_topics) & set(TARGET_TOPICS)
    topic_score = len(aligned_topics) * 10

    # builder score (repos indicate building, not just talking)
    builder_score = 0
    if len(repos) > 20:
        builder_score = 15
    elif len(repos) > 10:
        builder_score = 10
    elif len(repos) > 5:
        builder_score = 5

    # hireable bonus
    hireable_score = 5 if user.get('hireable') else 0

    # total score
    total_score = text_score + topic_score + builder_score + hireable_score

    # === LOST BUILDER DETECTION ===
    # build profile dict for lost analysis
    profile_for_lost = {
        'bio': user.get('bio'),
        'repos': repos,
        'public_repos': user.get('public_repos', len(repos)),
        'followers': user.get('followers', 0),
        'following': user.get('following', 0),
        'extra': {
            'top_repos': repos[:10],
        },
    }

    # analyze for lost signals
    lost_signals, lost_weight = analyze_github_for_lost_signals(profile_for_lost)

    # also check text for lost language patterns
    text_lost_signals, text_lost_weight = analyze_text_for_lost_signals(full_text)
    for sig in text_lost_signals:
        if sig not in lost_signals:
            lost_signals.append(sig)
    lost_weight += text_lost_weight

    lost_potential_score = lost_weight

    # classify: builder, lost, both, or none
    user_type = classify_user(lost_potential_score, builder_score, total_score)

    # confidence based on data richness
    confidence = 0.3
    if user.get('bio'):
        confidence += 0.15
    if len(repos) > 5:
        confidence += 0.15
    if len(text_parts) > 5:
        confidence += 0.15
    if user.get('email') or user.get('blog') or user.get('twitter_username'):
        confidence += 0.15
    if total_stars > 100:
        confidence += 0.1
    confidence = min(confidence, 1.0)

    # build reasons
    reasons = []
    if positive_signals:
        reasons.append(f"signals: {', '.join(positive_signals[:5])}")
    if aligned_topics:
        reasons.append(f"topics: {', '.join(list(aligned_topics)[:5])}")
    if builder_score > 0:
        reasons.append(f"builder ({len(repos)} repos)")
    if negative_signals:
        reasons.append(f"WARNING: {', '.join(negative_signals)}")

    # add lost reasons if applicable
    if user_type == 'lost' or user_type == 'both':
        lost_descriptions = get_signal_descriptions(lost_signals)
        if lost_descriptions:
            reasons.append(f"LOST SIGNALS: {', '.join(lost_descriptions[:3])}")

    # === DEEP HANDLE DISCOVERY ===
    # follow blog links, scrape websites, find ALL social handles
    handles, discovered_emails = discover_all_handles(user)

    # merge discovered emails with github email
    all_emails = discovered_emails or []
    if user.get('email'):
        all_emails.append(user['email'])
    all_emails = list(set(e for e in all_emails if e and 'noreply' not in e.lower()))

    return {
        'platform': 'github',
        'username': login,
        'url': f"https://github.com/{login}",
        'name': user.get('name'),
        'bio': user.get('bio'),
        'location': user.get('location'),
        'score': total_score,
        'confidence': confidence,
        'signals': positive_signals,
        'negative_signals': negative_signals,
        'topics': list(aligned_topics),
        'languages': dict(languages),
        'repo_count': len(repos),
        'total_stars': total_stars,
        'reasons': reasons,
        'contact': {
            'email': all_emails[0] if all_emails else None,
            'emails': all_emails,
            'blog': user.get('blog'),
            'twitter': user.get('twitter_username') or handles.get('twitter'),
            'mastodon': handles.get('mastodon'),
            'bluesky': handles.get('bluesky'),
            'matrix': handles.get('matrix'),
            'lemmy': handles.get('lemmy'),
        },
        'extra': {
            'topics': list(aligned_topics),
            'languages': dict(languages),
            'repo_count': len(repos),
            'total_stars': total_stars,
            'hireable': user.get('hireable', False),
            'handles': handles,  # all discovered handles
        },
        'hireable': user.get('hireable', False),
        'scraped_at': datetime.now().isoformat(),
        # lost builder fields
        'lost_potential_score': lost_potential_score,
        'lost_signals': lost_signals,
        'user_type': user_type,  # 'builder', 'lost', 'both', 'none'
    }
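A standalone sketch of the repo-count tiers `analyze_github_user` uses for `builder_score` (the `builder_score` function name here is mine; the module computes the same tiers inline):

```python
# standalone sketch of the builder_score tiers in analyze_github_user above
def builder_score(n_repos):
    # more public repos -> stronger "builds things, not just talks" signal
    if n_repos > 20:
        return 15
    if n_repos > 10:
        return 10
    if n_repos > 5:
        return 5
    return 0

print([builder_score(n) for n in (3, 8, 15, 40)])  # -> [0, 5, 10, 15]
```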
def scrape_github(db, limit_per_source=50):
    """
    full github scrape
    returns list of analyzed users
    """
    print("scoutd/github: starting scrape...")

    all_logins = set()

    # 1. ecosystem repo contributors
    print("  scraping ecosystem repo contributors...")
    for repo in ECOSYSTEM_REPOS:
        contributors = get_repo_contributors(repo, per_page=limit_per_source)
        for c in contributors:
            login = c.get('login')
            if login and not login.endswith('[bot]'):
                all_logins.add(login)
        print(f"    {repo}: {len(contributors)} contributors")

    # 2. topic repos
    print("  scraping topic repos...")
    for topic in TARGET_TOPICS[:10]:
        repos = search_repos_by_topic(topic, per_page=30)
        for repo in repos:
            owner = repo.get('owner', {}).get('login')
            if owner and not owner.endswith('[bot]'):
                all_logins.add(owner)
        print(f"    #{topic}: {len(repos)} repos")

    print(f"  found {len(all_logins)} unique users to analyze")

    # analyze each
    results = []
    builders_found = 0
    lost_found = 0

    for i, login in enumerate(all_logins):
        if i % 20 == 0:
            print(f"  analyzing... {i}/{len(all_logins)}")

        try:
            result = analyze_github_user(login)
            if result and result['score'] > 0:
                results.append(result)
                db.save_human(result)

                user_type = result.get('user_type', 'none')

                if user_type == 'builder':
                    builders_found += 1
                    if result['score'] >= 50:
                        print(f"    ★ {login}: {result['score']} pts, {result['confidence']:.0%} conf")

                elif user_type == 'lost':
                    lost_found += 1
                    lost_score = result.get('lost_potential_score', 0)
                    if lost_score >= 40:
                        print(f"    💔 {login}: lost_score={lost_score}, values={result['score']} pts")

                elif user_type == 'both':
                    builders_found += 1
                    lost_found += 1
                    print(f"    ⚡ {login}: recovering builder (lost={result.get('lost_potential_score', 0)}, active={result['score']})")

        except Exception as e:
            print(f"  error on {login}: {e}")

    print(f"scoutd/github: found {len(results)} aligned humans")
    print(f"  - {builders_found} active builders")
    print(f"  - {lost_found} lost builders (need encouragement)")
    return results

@@ -1,507 +0,0 @@
"""
|
||||
scoutd/handles.py - comprehensive social handle discovery
|
||||
|
||||
finds ALL social handles from:
|
||||
- github bio/profile
|
||||
- personal websites (rel="me", footers, contact pages, json-ld)
|
||||
- README files
|
||||
- linktree/bio.link/carrd pages
|
||||
- any linked pages
|
||||
|
||||
stores structured handle data for activity-based contact selection
|
||||
"""
|
||||
|
||||
import re
|
||||
import json
|
||||
import requests
|
||||
from urllib.parse import urlparse, urljoin
|
||||
from bs4 import BeautifulSoup
|
||||
|
||||
HEADERS = {'User-Agent': 'Mozilla/5.0 (compatible; connectd/1.0)'}
|
||||
|
||||
# platform URL patterns -> (platform, handle_extractor)
PLATFORM_PATTERNS = {
    # fediverse
    'mastodon': [
        (r'https?://([^/]+)/@([^/?#]+)', lambda m: f"@{m.group(2)}@{m.group(1)}"),
        (r'https?://([^/]+)/users/([^/?#]+)', lambda m: f"@{m.group(2)}@{m.group(1)}"),
        (r'https?://mastodon\.social/@([^/?#]+)', lambda m: f"@{m.group(1)}@mastodon.social"),
    ],
    'pixelfed': [
        (r'https?://pixelfed\.social/@([^/?#]+)', lambda m: f"@{m.group(1)}@pixelfed.social"),
        (r'https?://([^/]*pixelfed[^/]*)/@([^/?#]+)', lambda m: f"@{m.group(2)}@{m.group(1)}"),
    ],
    'lemmy': [
        (r'https?://([^/]+)/u/([^/?#]+)', lambda m: f"@{m.group(2)}@{m.group(1)}"),
        (r'https?://lemmy\.([^/]+)/u/([^/?#]+)', lambda m: f"@{m.group(2)}@lemmy.{m.group(1)}"),
    ],

    # mainstream
    'twitter': [
        (r'https?://(?:www\.)?(?:twitter|x)\.com/([^/?#]+)', lambda m: f"@{m.group(1)}"),
    ],
    'bluesky': [
        (r'https?://bsky\.app/profile/([^/?#]+)', lambda m: m.group(1)),
        (r'https?://([^.]+)\.bsky\.social', lambda m: f"{m.group(1)}.bsky.social"),
    ],
    'threads': [
        (r'https?://(?:www\.)?threads\.net/@([^/?#]+)', lambda m: f"@{m.group(1)}"),
    ],
    'instagram': [
        (r'https?://(?:www\.)?instagram\.com/([^/?#]+)', lambda m: f"@{m.group(1)}"),
    ],
    'facebook': [
        (r'https?://(?:www\.)?facebook\.com/([^/?#]+)', lambda m: m.group(1)),
    ],
    'linkedin': [
        (r'https?://(?:www\.)?linkedin\.com/in/([^/?#]+)', lambda m: m.group(1)),
        (r'https?://(?:www\.)?linkedin\.com/company/([^/?#]+)', lambda m: f"company/{m.group(1)}"),
    ],

    # dev platforms
    'github': [
        (r'https?://(?:www\.)?github\.com/([^/?#]+)', lambda m: m.group(1)),
    ],
    'gitlab': [
        (r'https?://(?:www\.)?gitlab\.com/([^/?#]+)', lambda m: m.group(1)),
    ],
    'codeberg': [
        (r'https?://codeberg\.org/([^/?#]+)', lambda m: m.group(1)),
    ],
    'sourcehut': [
        (r'https?://sr\.ht/~([^/?#]+)', lambda m: f"~{m.group(1)}"),
        (r'https?://git\.sr\.ht/~([^/?#]+)', lambda m: f"~{m.group(1)}"),
    ],

    # chat
    'matrix': [
        (r'https?://matrix\.to/#/(@[^:]+:[^/?#]+)', lambda m: m.group(1)),
    ],
    'discord': [
        (r'https?://discord\.gg/([^/?#]+)', lambda m: f"invite/{m.group(1)}"),
        (r'https?://discord\.com/invite/([^/?#]+)', lambda m: f"invite/{m.group(1)}"),
    ],
    'telegram': [
        (r'https?://t\.me/([^/?#]+)', lambda m: f"@{m.group(1)}"),
    ],

    # content
    'youtube': [
        (r'https?://(?:www\.)?youtube\.com/@([^/?#]+)', lambda m: f"@{m.group(1)}"),
        (r'https?://(?:www\.)?youtube\.com/c(?:hannel)?/([^/?#]+)', lambda m: m.group(1)),
    ],
    'twitch': [
        (r'https?://(?:www\.)?twitch\.tv/([^/?#]+)', lambda m: m.group(1)),
    ],
    'substack': [
        (r'https?://([^.]+)\.substack\.com', lambda m: m.group(1)),
    ],
    'medium': [
        (r'https?://(?:www\.)?medium\.com/@([^/?#]+)', lambda m: f"@{m.group(1)}"),
        (r'https?://([^.]+)\.medium\.com', lambda m: m.group(1)),
    ],
    'devto': [
        (r'https?://dev\.to/([^/?#]+)', lambda m: m.group(1)),
    ],

    # funding
    'kofi': [
        (r'https?://ko-fi\.com/([^/?#]+)', lambda m: m.group(1)),
    ],
    'patreon': [
        (r'https?://(?:www\.)?patreon\.com/([^/?#]+)', lambda m: m.group(1)),
    ],
    'liberapay': [
        (r'https?://liberapay\.com/([^/?#]+)', lambda m: m.group(1)),
    ],
    'github_sponsors': [
        (r'https?://github\.com/sponsors/([^/?#]+)', lambda m: m.group(1)),
    ],

    # link aggregators (we'll parse these specially)
    'linktree': [
        (r'https?://linktr\.ee/([^/?#]+)', lambda m: m.group(1)),
    ],
    'biolink': [
        (r'https?://bio\.link/([^/?#]+)', lambda m: m.group(1)),
    ],
    'carrd': [
        (r'https?://([^.]+)\.carrd\.co', lambda m: m.group(1)),
    ],
}

# fediverse handle pattern: @user@instance
FEDIVERSE_HANDLE_PATTERN = re.compile(r'@([\w.-]+)@([\w.-]+\.[\w]+)')

# email pattern
EMAIL_PATTERN = re.compile(r'\b([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,})\b')

# known fediverse instances (for context-free handle detection)
KNOWN_FEDIVERSE_INSTANCES = [
    'mastodon.social', 'mastodon.online', 'mstdn.social', 'mas.to',
    'tech.lgbt', 'fosstodon.org', 'hackers.town', 'social.coop',
    'kolektiva.social', 'solarpunk.moe', 'wandering.shop',
    'elekk.xyz', 'cybre.space', 'octodon.social', 'chaos.social',
    'infosec.exchange', 'ruby.social', 'phpc.social', 'toot.cafe',
    'mstdn.io', 'pixelfed.social', 'lemmy.ml', 'lemmy.world',
    'kbin.social', 'pleroma.site', 'akkoma.dev',
]


def extract_handle_from_url(url):
    """extract platform and handle from a URL"""
    for platform, patterns in PLATFORM_PATTERNS.items():
        for pattern, extractor in patterns:
            match = re.match(pattern, url, re.I)
            if match:
                return platform, extractor(match)
    return None, None


def extract_fediverse_handles(text):
    """find @user@instance.tld patterns in text"""
    handles = []
    for match in FEDIVERSE_HANDLE_PATTERN.finditer(text):
        user, instance = match.groups()
        handles.append(f"@{user}@{instance}")
    return handles


def extract_emails(text):
    """find email addresses in text"""
    emails = []
    for match in EMAIL_PATTERN.finditer(text):
        email = match.group(1)
        # filter out common non-personal emails
        if not any(x in email.lower() for x in ['noreply', 'no-reply', 'donotreply', 'example.com']):
            emails.append(email)
    return emails
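A self-contained sketch of the two regexes defined above (same patterns, redeclared here so the snippet runs on its own; the sample text and the `noreply` filter echo the module's behavior). Note that the email regex also matches the `user@instance` tail of a fediverse handle, so the two extractors can overlap on the same text:

```python
# standalone demo of the fediverse-handle and email regexes defined above
import re

FEDIVERSE_HANDLE_PATTERN = re.compile(r'@([\w.-]+)@([\w.-]+\.[\w]+)')
EMAIL_PATTERN = re.compile(r'\b([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,})\b')

text = "toot: @alice@fosstodon.org / mail: bob@example.org / noreply@github.com"
fedi = [f"@{u}@{i}" for u, i in FEDIVERSE_HANDLE_PATTERN.findall(text)]
emails = [e for e in EMAIL_PATTERN.findall(text) if 'noreply' not in e.lower()]

print(fedi)    # -> ['@alice@fosstodon.org']
print(emails)  # -> ['alice@fosstodon.org', 'bob@example.org']
```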
def scrape_page(url, timeout=15):
    """fetch and parse a web page"""
    try:
        resp = requests.get(url, headers=HEADERS, timeout=timeout, allow_redirects=True)
        resp.raise_for_status()
        return BeautifulSoup(resp.text, 'html.parser'), resp.text
    except Exception:
        return None, None


def extract_rel_me_links(soup):
    """extract rel="me" links (used for verification)"""
    links = []
    if not soup:
        return links

    for a in soup.find_all('a', rel=lambda x: x and 'me' in x):
        href = a.get('href')
        if href:
            links.append(href)

    return links


def extract_social_links_from_page(soup, base_url=None):
    """extract all social links from a page"""
    links = []
    if not soup:
        return links

    # all links
    for a in soup.find_all('a', href=True):
        href = a['href']
        if base_url and not href.startswith('http'):
            href = urljoin(base_url, href)

        # check if it's a known social platform
        platform, handle = extract_handle_from_url(href)
        if platform:
            links.append({'platform': platform, 'handle': handle, 'url': href})

    return links


def extract_json_ld(soup):
    """extract structured data from JSON-LD"""
    data = {}
    if not soup:
        return data

    for script in soup.find_all('script', type='application/ld+json'):
        try:
            ld = json.loads(script.string)
            # look for sameAs links (social profiles)
            if isinstance(ld, dict):
                same_as = ld.get('sameAs', [])
                if isinstance(same_as, str):
                    same_as = [same_as]
                for url in same_as:
                    platform, handle = extract_handle_from_url(url)
                    if platform:
                        data[platform] = handle
        except Exception:
            pass

    return data


def scrape_linktree(url):
    """scrape a linktree/bio.link/carrd page for all links"""
    handles = {}
    soup, raw = scrape_page(url)
    if not soup:
        return handles

    # linktree uses data attributes and JS, but links are often in the HTML
    links = extract_social_links_from_page(soup, url)
    for link in links:
        if link['platform'] not in ['linktree', 'biolink', 'carrd']:
            handles[link['platform']] = link['handle']

    # also check for fediverse handles in text
    if raw:
        fedi_handles = extract_fediverse_handles(raw)
        if fedi_handles:
            handles['mastodon'] = fedi_handles[0]

    return handles


def scrape_website_for_handles(url, follow_links=True):
    """
    comprehensive website scrape for social handles

    checks:
    - rel="me" links
    - social links in page
    - json-ld structured data
    - /about and /contact pages
    - fediverse handles in text
    - emails
    """
    handles = {}
    emails = []

    soup, raw = scrape_page(url)
    if not soup:
        return handles, emails

    # 1. rel="me" links (most authoritative)
    rel_me = extract_rel_me_links(soup)
    for link in rel_me:
        platform, handle = extract_handle_from_url(link)
        if platform and platform not in handles:
            handles[platform] = handle

    # 2. all social links on page
    social_links = extract_social_links_from_page(soup, url)
    for link in social_links:
        if link['platform'] not in handles:
            handles[link['platform']] = link['handle']

    # 3. json-ld structured data
    json_ld = extract_json_ld(soup)
    for platform, handle in json_ld.items():
        if platform not in handles:
            handles[platform] = handle

    # 4. fediverse handles in text
    if raw:
        fedi = extract_fediverse_handles(raw)
        if fedi and 'mastodon' not in handles:
            handles['mastodon'] = fedi[0]

    # emails
    emails = extract_emails(raw)

    # 5. follow links to /about, /contact
    if follow_links:
        parsed = urlparse(url)
        base = f"{parsed.scheme}://{parsed.netloc}"

        for path in ['/about', '/contact', '/links', '/social']:
            try:
                sub_soup, sub_raw = scrape_page(base + path)
                if sub_soup:
                    sub_links = extract_social_links_from_page(sub_soup, base)
                    for link in sub_links:
                        if link['platform'] not in handles:
                            handles[link['platform']] = link['handle']

                    if sub_raw:
                        fedi = extract_fediverse_handles(sub_raw)
                        if fedi and 'mastodon' not in handles:
                            handles['mastodon'] = fedi[0]

                        emails.extend(extract_emails(sub_raw))
            except Exception:
                pass

    # 6. check for linktree etc in links and follow them
    for platform in ['linktree', 'biolink', 'carrd']:
        if platform in handles:
            # this is actually a link aggregator, scrape it
            link_url = None
            for link in social_links:
                if link['platform'] == platform:
                    link_url = link['url']
                    break

            if link_url:
                aggregator_handles = scrape_linktree(link_url)
                for p, h in aggregator_handles.items():
                    if p not in handles:
                        handles[p] = h

            del handles[platform]  # remove the aggregator itself

    return handles, list(set(emails))


def extract_handles_from_text(text):
|
||||
"""extract handles from plain text (bio, README, etc)"""
|
||||
handles = {}
|
||||
|
||||
if not text:
|
||||
return handles
|
||||
|
||||
# fediverse handles
|
||||
fedi = extract_fediverse_handles(text)
|
||||
if fedi:
|
||||
handles['mastodon'] = fedi[0]
|
||||
|
||||
# URL patterns in text
|
||||
url_pattern = re.compile(r'https?://[^\s<>"\']+')
|
||||
for match in url_pattern.finditer(text):
|
||||
url = match.group(0).rstrip('.,;:!?)')
|
||||
platform, handle = extract_handle_from_url(url)
|
||||
if platform and platform not in handles:
|
||||
handles[platform] = handle
|
||||
|
||||
# twitter-style @mentions (only if looks like twitter context)
|
||||
if 'twitter' in text.lower() or 'x.com' in text.lower():
|
||||
twitter_pattern = re.compile(r'(?:^|[^\w])@(\w{1,15})(?:[^\w]|$)')
|
||||
for match in twitter_pattern.finditer(text):
|
||||
if 'twitter' not in handles:
|
||||
handles['twitter'] = f"@{match.group(1)}"
|
||||
|
||||
# matrix handles
|
||||
matrix_pattern = re.compile(r'@([\w.-]+):([\w.-]+)')
|
||||
for match in matrix_pattern.finditer(text):
|
||||
if 'matrix' not in handles:
|
||||
handles['matrix'] = f"@{match.group(1)}:{match.group(2)}"
|
||||
|
||||
return handles
|
||||
|
||||
|
||||
def scrape_github_readme(username):
    """scrape user's profile README (the username/username repo)"""
    # the default branch may be either main or master - try both
    for branch in ('main', 'master'):
        url = f"https://raw.githubusercontent.com/{username}/{username}/{branch}/README.md"
        try:
            resp = requests.get(url, headers=HEADERS, timeout=10)
            if resp.status_code == 200:
                text = resp.text
                return extract_handles_from_text(text), extract_emails(text)
        except requests.exceptions.RequestException:
            pass

    return {}, []


def discover_all_handles(github_profile):
    """
    comprehensive handle discovery from a github profile dict

    github_profile should contain:
    - username
    - bio
    - blog (website URL)
    - twitter_username
    - etc.
    """
    handles = {}
    emails = []

    username = github_profile.get('login') or github_profile.get('username')

    print(f" discovering handles for {username}...")

    # 1. github bio
    bio = github_profile.get('bio', '')
    if bio:
        bio_handles = extract_handles_from_text(bio)
        handles.update(bio_handles)
        emails.extend(extract_emails(bio))

    # 2. twitter from github profile
    twitter = github_profile.get('twitter_username')
    if twitter and 'twitter' not in handles:
        handles['twitter'] = f"@{twitter}"

    # 3. website from github profile
    website = github_profile.get('blog')
    if website:
        if not website.startswith('http'):
            website = f"https://{website}"

        print(f" scraping website: {website}")
        site_handles, site_emails = scrape_website_for_handles(website)
        for p, h in site_handles.items():
            if p not in handles:
                handles[p] = h
        emails.extend(site_emails)

    # 4. profile README
    if username:
        print(" checking profile README...")
        readme_handles, readme_emails = scrape_github_readme(username)
        for p, h in readme_handles.items():
            if p not in handles:
                handles[p] = h
        emails.extend(readme_emails)

    # 5. email from github profile
    github_email = github_profile.get('email')
    if github_email:
        emails.append(github_email)

    # dedupe emails, dropping noreply addresses
    emails = list(set(e for e in emails if e and '@' in e and 'noreply' not in e.lower()))

    print(f" found {len(handles)} handles, {len(emails)} emails")

    return handles, emails


def merge_handles(existing, new):
    """merge new handles into existing, preferring more specific handles"""
    for platform, handle in new.items():
        if platform not in existing:
            existing[platform] = handle
        elif len(handle) > len(existing[platform]):
            # prefer the longer / more specific handle
            existing[platform] = handle

    return existing

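The merge rule is small enough to sketch standalone; `merge_handles` below mirrors the function in this file, and the sample handles are invented for illustration:

```python
def merge_handles(existing, new):
    # mirrors the rule above: fill gaps, prefer the longer handle on conflict
    for platform, handle in new.items():
        if platform not in existing or len(handle) > len(existing[platform]):
            existing[platform] = handle
    return existing

merged = merge_handles(
    {'twitter': '@alice'},
    {'twitter': '@alice_dev', 'matrix': '@alice:example.org'},
)
print(merged)  # {'twitter': '@alice_dev', 'matrix': '@alice:example.org'}
```

Note that the dict passed as `existing` is mutated in place, so callers that need the original should pass a copy.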
@@ -1,322 +0,0 @@
"""
|
||||
scoutd/lemmy.py - lemmy (fediverse reddit) discovery
|
||||
|
||||
lemmy is federated so we hit multiple instances.
|
||||
great for finding lost builders in communities like:
|
||||
- /c/programming, /c/technology, /c/linux
|
||||
- /c/antiwork, /c/workreform (lost builders!)
|
||||
- /c/selfhosted, /c/privacy, /c/opensource
|
||||
|
||||
supports authenticated access for private instances and DM delivery.
|
||||
"""
|
||||
|
||||
import requests
|
||||
import json
|
||||
import time
|
||||
import os
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
|
||||
from .signals import analyze_text
|
||||
from .lost import (
|
||||
analyze_social_for_lost_signals,
|
||||
analyze_text_for_lost_signals,
|
||||
classify_user,
|
||||
)
|
||||
|
||||
# auth config from environment
|
||||
LEMMY_INSTANCE = os.environ.get('LEMMY_INSTANCE', '')
|
||||
LEMMY_USERNAME = os.environ.get('LEMMY_USERNAME', '')
|
||||
LEMMY_PASSWORD = os.environ.get('LEMMY_PASSWORD', '')
|
||||
|
||||
# auth token cache
|
||||
_auth_token = None
|
||||
|
||||
# popular lemmy instances
|
||||
LEMMY_INSTANCES = [
|
||||
'lemmy.ml',
|
||||
'lemmy.world',
|
||||
'programming.dev',
|
||||
'lemm.ee',
|
||||
'sh.itjust.works',
|
||||
]
|
||||
|
||||
# communities to scout (format: community@instance or just community for local)
|
||||
TARGET_COMMUNITIES = [
|
||||
# builder communities
|
||||
'programming',
|
||||
'selfhosted',
|
||||
'linux',
|
||||
'opensource',
|
||||
'privacy',
|
||||
'technology',
|
||||
'webdev',
|
||||
'rust',
|
||||
'python',
|
||||
'golang',
|
||||
|
||||
# lost builder communities (people struggling, stuck, seeking)
|
||||
'antiwork',
|
||||
'workreform',
|
||||
'careerguidance',
|
||||
'cscareerquestions',
|
||||
'learnprogramming',
|
||||
'findapath',
|
||||
]
|
||||
|
||||
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'lemmy'
|
||||
CACHE_DIR.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
|
||||
def get_auth_token(instance=None):
    """get (and cache) an auth token for the lemmy instance"""
    global _auth_token

    if _auth_token:
        return _auth_token

    instance = instance or LEMMY_INSTANCE
    if not all([instance, LEMMY_USERNAME, LEMMY_PASSWORD]):
        return None

    try:
        url = f"https://{instance}/api/v3/user/login"
        resp = requests.post(url, json={
            'username_or_email': LEMMY_USERNAME,
            'password': LEMMY_PASSWORD,
        }, timeout=30)

        if resp.status_code == 200:
            _auth_token = resp.json().get('jwt')
            return _auth_token
        return None
    except Exception as e:
        print(f"lemmy auth error: {e}")
        return None


def send_lemmy_dm(recipient_username, message, dry_run=False):
    """send a private message via lemmy"""
    if not LEMMY_INSTANCE:
        return False, "LEMMY_INSTANCE not configured"

    if dry_run:
        print(f"[dry run] would send lemmy DM to {recipient_username}")
        return True, None

    token = get_auth_token()
    if not token:
        return False, "failed to authenticate with lemmy"

    try:
        # parse recipient - could be username@instance or just username
        if '@' in recipient_username:
            username, instance = recipient_username.split('@', 1)
        else:
            username = recipient_username
            instance = LEMMY_INSTANCE

        # resolve the recipient's user id
        user_url = f"https://{LEMMY_INSTANCE}/api/v3/user"
        resp = requests.get(user_url, params={'username': f"{username}@{instance}"}, timeout=30)

        if resp.status_code != 200:
            # try without the instance suffix for local users
            resp = requests.get(user_url, params={'username': username}, timeout=30)

        if resp.status_code != 200:
            return False, f"could not find user {recipient_username}"

        recipient_id = resp.json().get('person_view', {}).get('person', {}).get('id')
        if not recipient_id:
            return False, "could not get recipient id"

        # send the DM
        dm_url = f"https://{LEMMY_INSTANCE}/api/v3/private_message"
        resp = requests.post(
            dm_url,
            headers={'Authorization': f'Bearer {token}'},
            json={
                'content': message,
                'recipient_id': recipient_id,
            },
            timeout=30,
        )

        if resp.status_code == 200:
            return True, None
        else:
            return False, f"lemmy DM error: {resp.status_code} - {resp.text}"

    except Exception as e:
        return False, f"lemmy DM error: {str(e)}"


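The recipient-parsing step in `send_lemmy_dm` can be traced in isolation; `local_instance` below stands in for the `LEMMY_INSTANCE` setting, and the handles are invented:

```python
def parse_recipient(recipient, local_instance):
    # same split as send_lemmy_dm: user@instance is remote, bare names are local
    if '@' in recipient:
        username, instance = recipient.split('@', 1)
    else:
        username, instance = recipient, local_instance
    return username, instance

print(parse_recipient('alice@programming.dev', 'lemmy.ml'))  # ('alice', 'programming.dev')
print(parse_recipient('bob', 'lemmy.ml'))                    # ('bob', 'lemmy.ml')
```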
def get_community_posts(instance, community, limit=50, sort='New'):
    """get posts from a lemmy community"""
    try:
        url = f"https://{instance}/api/v3/post/list"
        params = {
            'community_name': community,
            'sort': sort,
            'limit': limit,
        }

        resp = requests.get(url, params=params, timeout=30)
        if resp.status_code == 200:
            return resp.json().get('posts', [])
        return []
    except Exception:
        return []


def get_user_profile(instance, username):
    """get a lemmy user profile"""
    try:
        url = f"https://{instance}/api/v3/user"
        params = {'username': username}

        resp = requests.get(url, params=params, timeout=30)
        if resp.status_code == 200:
            return resp.json()
        return None
    except Exception:
        return None


def analyze_lemmy_user(instance, username, posts=None):
    """analyze a lemmy user for values alignment and lost signals"""
    profile = get_user_profile(instance, username)
    if not profile:
        return None

    person = profile.get('person_view', {}).get('person', {})
    counts = profile.get('person_view', {}).get('counts', {})

    bio = person.get('bio', '') or ''
    display_name = person.get('display_name') or person.get('name', username)

    # analyze bio
    bio_score, bio_signals, bio_reasons = analyze_text(bio)

    # analyze posts if provided
    post_signals = []
    post_text = []
    if posts:
        for post in posts[:10]:
            post_data = post.get('post', {})
            title = post_data.get('name', '')
            body = post_data.get('body', '')
            post_text.append(f"{title} {body}")

            _, signals, _ = analyze_text(f"{title} {body}")
            post_signals.extend(signals)

    all_signals = list(set(bio_signals + post_signals))
    total_score = bio_score + len(post_signals) * 5

    # lost builder detection
    profile_for_lost = {
        'bio': bio,
        'post_count': counts.get('post_count', 0),
        'comment_count': counts.get('comment_count', 0),
    }
    # analyze_social_for_lost_signals reads the 'content' key from each post
    posts_for_lost = [{'content': t} for t in post_text]

    lost_signals, lost_weight = analyze_social_for_lost_signals(profile_for_lost, posts_for_lost)
    lost_potential_score = lost_weight
    user_type = classify_user(lost_potential_score, 50, total_score)

    return {
        'platform': 'lemmy',
        'username': f"{username}@{instance}",
        'url': f"https://{instance}/u/{username}",
        'name': display_name,
        'bio': bio,
        'location': None,
        'score': total_score,
        'confidence': min(0.9, 0.3 + len(all_signals) * 0.1),
        'signals': all_signals,
        'negative_signals': [],
        'reasons': bio_reasons,
        'contact': {},
        'extra': {
            'instance': instance,
            'post_count': counts.get('post_count', 0),
            'comment_count': counts.get('comment_count', 0),
        },
        'lost_potential_score': lost_potential_score,
        'lost_signals': lost_signals,
        'user_type': user_type,
    }


def scrape_lemmy(db, limit_per_community=30):
    """scrape lemmy instances for aligned builders"""
    print("scouting lemmy...")

    found = 0
    lost_found = 0
    seen_users = set()

    # build the instance list - the user's own instance first, if configured
    instances = list(LEMMY_INSTANCES)
    if LEMMY_INSTANCE and LEMMY_INSTANCE not in instances:
        instances.insert(0, LEMMY_INSTANCE)

    for instance in instances:
        print(f" instance: {instance}")

        for community in TARGET_COMMUNITIES:
            posts = get_community_posts(instance, community, limit=limit_per_community)

            if not posts:
                continue

            print(f" /c/{community}: {len(posts)} posts")

            # group posts by user
            user_posts = {}
            for post in posts:
                creator = post.get('creator', {})
                username = creator.get('name')
                if not username:
                    continue

                user_key = f"{username}@{instance}"
                if user_key in seen_users:
                    continue

                if user_key not in user_posts:
                    user_posts[user_key] = []
                user_posts[user_key].append(post)

            # analyze each user
            for user_key, posts in user_posts.items():
                username = user_key.split('@')[0]

                if user_key in seen_users:
                    continue
                seen_users.add(user_key)

                result = analyze_lemmy_user(instance, username, posts)
                if not result:
                    continue

                if result['score'] >= 20 or result.get('lost_potential_score', 0) >= 30:
                    db.save_human(result)
                    found += 1

                    if result.get('user_type') in ['lost', 'both']:
                        lost_found += 1
                        print(f" {result['username']}: {result['score']:.0f} (lost: {result['lost_potential_score']:.0f})")
                    elif result['score'] >= 40:
                        print(f" {result['username']}: {result['score']:.0f}")

                time.sleep(0.5)  # rate limit

            time.sleep(1)  # between communities

        time.sleep(2)  # between instances

    print(f"lemmy: found {found} humans ({lost_found} lost builders)")
    return found

@@ -1,169 +0,0 @@
"""
|
||||
scoutd/lobsters.py - lobste.rs discovery
|
||||
high-signal invite-only tech community
|
||||
"""
|
||||
|
||||
import requests
|
||||
import json
|
||||
import time
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
|
||||
from .signals import analyze_text
|
||||
|
||||
HEADERS = {'User-Agent': 'connectd/1.0', 'Accept': 'application/json'}
|
||||
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'lobsters'
|
||||
|
||||
ALIGNED_TAGS = ['privacy', 'security', 'distributed', 'rust', 'linux', 'culture', 'practices']
|
||||
|
||||
|
||||
def _api_get(url, params=None):
    """rate-limited, cached GET request"""
    import hashlib  # local import; a stable digest keeps cache filenames valid across runs

    cache_key = f"{url}_{json.dumps(params or {}, sort_keys=True)}"
    # note: the builtin hash() is salted per process (PYTHONHASHSEED), which
    # would defeat the on-disk cache - use a stable digest for the filename
    cache_file = CACHE_DIR / f"{hashlib.md5(cache_key.encode()).hexdigest()}.json"
    CACHE_DIR.mkdir(parents=True, exist_ok=True)

    if cache_file.exists():
        try:
            data = json.loads(cache_file.read_text())
            if time.time() - data.get('_cached_at', 0) < 3600:
                return data.get('_data')
        except (ValueError, OSError):
            pass

    time.sleep(2)

    try:
        resp = requests.get(url, headers=HEADERS, params=params, timeout=30)
        resp.raise_for_status()
        result = resp.json()
        cache_file.write_text(json.dumps({'_cached_at': time.time(), '_data': result}))
        return result
    except requests.exceptions.RequestException as e:
        print(f" lobsters api error: {e}")
        return None


def get_stories_by_tag(tag):
    """get recent stories for a tag"""
    url = f'https://lobste.rs/t/{tag}.json'
    return _api_get(url) or []


def get_newest_stories():
    """get the newest stories"""
    return _api_get('https://lobste.rs/newest.json') or []


def get_user(username):
    """get a user profile"""
    return _api_get(f'https://lobste.rs/u/{username}.json')


def analyze_lobsters_user(username):
    """analyze a lobste.rs user"""
    user = get_user(username)
    if not user:
        return None

    text_parts = []
    if user.get('about'):
        text_parts.append(user['about'])

    full_text = ' '.join(text_parts)
    text_score, positive_signals, negative_signals = analyze_text(full_text)

    # lobsters base bonus (invite-only, high signal)
    base_score = 15

    # karma bonus
    karma = user.get('karma', 0)
    karma_score = 0
    if karma > 100:
        karma_score = 10
    elif karma > 50:
        karma_score = 5

    # github presence
    github_score = 5 if user.get('github_username') else 0

    # homepage
    homepage_score = 5 if user.get('homepage') else 0

    total_score = text_score + base_score + karma_score + github_score + homepage_score

    # confidence
    confidence = 0.4  # higher base for an invite-only site
    if text_parts:
        confidence += 0.2
    if karma > 50:
        confidence += 0.2
    confidence = min(confidence, 0.9)

    reasons = ['on lobste.rs (invite-only)']
    if karma > 50:
        reasons.append(f"active ({karma} karma)")
    if positive_signals:
        reasons.append(f"signals: {', '.join(positive_signals[:5])}")
    if negative_signals:
        reasons.append(f"WARNING: {', '.join(negative_signals)}")

    return {
        'platform': 'lobsters',
        'username': username,
        'url': f"https://lobste.rs/u/{username}",
        'score': total_score,
        'confidence': confidence,
        'signals': positive_signals,
        'negative_signals': negative_signals,
        'karma': karma,
        'reasons': reasons,
        'contact': {
            'github': user.get('github_username'),
            'twitter': user.get('twitter_username'),
            'homepage': user.get('homepage'),
        },
        'scraped_at': datetime.now().isoformat(),
    }


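The additive score in `analyze_lobsters_user` is easy to trace by hand; the numbers below describe a hypothetical user (bio signals worth 10 points, 120 karma, github and homepage both set):

```python
text_score = 10      # from analyze_text on the bio (hypothetical)
base_score = 15      # flat bonus for being on an invite-only site
karma_score = 10     # karma > 100
github_score = 5     # github_username present
homepage_score = 5   # homepage present

total = text_score + base_score + karma_score + github_score + homepage_score
print(total)  # 45
```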
def scrape_lobsters(db):
    """full lobste.rs scrape"""
    print("scoutd/lobsters: starting scrape...")

    all_users = set()

    # stories from aligned tags
    for tag in ALIGNED_TAGS:
        print(f" tag: {tag}...")
        stories = get_stories_by_tag(tag)
        for story in stories:
            submitter = story.get('submitter_user', {}).get('username')
            if submitter:
                all_users.add(submitter)

    # newest stories
    print(" newest stories...")
    for story in get_newest_stories():
        submitter = story.get('submitter_user', {}).get('username')
        if submitter:
            all_users.add(submitter)

    print(f" {len(all_users)} unique users to analyze")

    # analyze
    results = []
    for username in all_users:
        try:
            result = analyze_lobsters_user(username)
            if result and result['score'] > 0:
                results.append(result)
                db.save_human(result)

                if result['score'] >= 30:
                    print(f" ★ {username}: {result['score']} pts")
        except Exception as e:
            print(f" error on {username}: {e}")

    print(f"scoutd/lobsters: found {len(results)} aligned humans")
    return results

@@ -1,491 +0,0 @@
"""
|
||||
scoutd/lost.py - lost builder detection
|
||||
|
||||
finds people with potential who haven't found it yet, gave up, or are too beaten down to try.
|
||||
|
||||
these aren't failures. they're seeds that never got water.
|
||||
|
||||
detection signals:
|
||||
- github: forked but never modified, starred many but built nothing, learning repos abandoned
|
||||
- reddit/forums: "i wish i could...", stuck asking beginner questions for years, helping others but never sharing
|
||||
- social: retoots builders but never posts own work, imposter syndrome language, isolation signals
|
||||
- profiles: bio says what they WANT to be, "aspiring" for 2+ years, empty portfolios
|
||||
|
||||
the goal isn't to recruit them. it's to show them the door exists.
|
||||
"""
|
||||
|
||||
import re
|
||||
from datetime import datetime, timedelta
|
||||
from collections import defaultdict
|
||||
|
||||
|
||||
# signal definitions with weights
|
||||
LOST_SIGNALS = {
|
||||
# github signals
|
||||
'forked_never_modified': {
|
||||
'weight': 15,
|
||||
'category': 'github',
|
||||
'description': 'forked repos but never pushed changes',
|
||||
},
|
||||
'starred_many_built_nothing': {
|
||||
'weight': 20,
|
||||
'category': 'github',
|
||||
'description': 'starred 50+ repos but has 0-2 own repos',
|
||||
},
|
||||
'account_no_repos': {
|
||||
'weight': 10,
|
||||
'category': 'github',
|
||||
'description': 'account exists but no public repos',
|
||||
},
|
||||
'inactivity_bursts': {
|
||||
'weight': 15,
|
||||
'category': 'github',
|
||||
'description': 'long gaps then brief activity bursts',
|
||||
},
|
||||
'only_issues_comments': {
|
||||
'weight': 12,
|
||||
'category': 'github',
|
||||
'description': 'only activity is issues/comments on others work',
|
||||
},
|
||||
'abandoned_learning_repos': {
|
||||
'weight': 18,
|
||||
'category': 'github',
|
||||
'description': 'learning/tutorial repos that were never finished',
|
||||
},
|
||||
'readme_only_repos': {
|
||||
'weight': 10,
|
||||
'category': 'github',
|
||||
'description': 'repos with just README, no actual code',
|
||||
},
|
||||
|
||||
# language signals (from posts/comments/bio)
|
||||
'wish_i_could': {
|
||||
'weight': 12,
|
||||
'category': 'language',
|
||||
'description': '"i wish i could..." language',
|
||||
'patterns': [
|
||||
r'i wish i could',
|
||||
r'i wish i knew how',
|
||||
r'wish i had the (time|energy|motivation|skills?)',
|
||||
],
|
||||
},
|
||||
'someday_want': {
|
||||
'weight': 10,
|
||||
'category': 'language',
|
||||
'description': '"someday i want to..." language',
|
||||
'patterns': [
|
||||
r'someday i (want|hope|plan) to',
|
||||
r'one day i\'ll',
|
||||
r'eventually i\'ll',
|
||||
r'when i have time i\'ll',
|
||||
],
|
||||
},
|
||||
'stuck_beginner': {
|
||||
'weight': 20,
|
||||
'category': 'language',
|
||||
'description': 'asking beginner questions for years',
|
||||
'patterns': [
|
||||
r'still (trying|learning|struggling) (to|with)',
|
||||
r'can\'t seem to (get|understand|figure)',
|
||||
r'been trying for (months|years)',
|
||||
],
|
||||
},
|
||||
'self_deprecating': {
|
||||
'weight': 15,
|
||||
'category': 'language',
|
||||
'description': 'self-deprecating about abilities',
|
||||
'patterns': [
|
||||
r'i\'m (not smart|too dumb|not good) enough',
|
||||
r'i (suck|am terrible) at',
|
||||
r'i\'ll never be able to',
|
||||
r'people like me (can\'t|don\'t)',
|
||||
r'i\'m just not (a|the) (type|kind)',
|
||||
],
|
||||
},
|
||||
'no_energy': {
|
||||
'weight': 18,
|
||||
'category': 'language',
|
||||
'description': '"how do people have energy" posts',
|
||||
'patterns': [
|
||||
r'how do (people|you|they) have (the )?(energy|time|motivation)',
|
||||
r'where do (people|you|they) find (the )?(energy|motivation)',
|
||||
r'i\'m (always|constantly) (tired|exhausted|drained)',
|
||||
r'no (energy|motivation) (left|anymore)',
|
||||
],
|
||||
},
|
||||
'imposter_syndrome': {
|
||||
'weight': 15,
|
||||
'category': 'language',
|
||||
'description': 'imposter syndrome language',
|
||||
'patterns': [
|
||||
r'imposter syndrome',
|
||||
r'feel like (a |an )?(fraud|fake|imposter)',
|
||||
r'don\'t (belong|deserve)',
|
||||
r'everyone else (seems|is) (so much )?(better|smarter)',
|
||||
r'they\'ll (find out|realize) i\'m',
|
||||
],
|
||||
},
|
||||
'should_really': {
|
||||
'weight': 8,
|
||||
'category': 'language',
|
||||
'description': '"i should really..." posts',
|
||||
'patterns': [
|
||||
r'i (should|need to) really',
|
||||
r'i keep (meaning|wanting) to',
|
||||
r'i\'ve been (meaning|wanting) to',
|
||||
],
|
||||
},
|
||||
'isolation_signals': {
|
||||
'weight': 20,
|
||||
'category': 'language',
|
||||
'description': 'isolation/loneliness language',
|
||||
'patterns': [
|
||||
r'no one (understands|gets it|to talk to)',
|
||||
r'(feel|feeling) (so )?(alone|isolated|lonely)',
|
||||
r'don\'t have anyone (to|who)',
|
||||
r'wish i (had|knew) (someone|people)',
|
||||
],
|
||||
},
|
||||
'enthusiasm_for_others': {
|
||||
'weight': 10,
|
||||
'category': 'behavior',
|
||||
'description': 'celebrates others but dismissive of self',
|
||||
},
|
||||
|
||||
# subreddit/community signals
|
||||
'stuck_communities': {
|
||||
'weight': 15,
|
||||
'category': 'community',
|
||||
'description': 'active in stuck/struggling communities',
|
||||
'subreddits': [
|
||||
'learnprogramming',
|
||||
'findapath',
|
||||
'getdisciplined',
|
||||
'getmotivated',
|
||||
'decidingtobebetter',
|
||||
'selfimprovement',
|
||||
'adhd',
|
||||
'depression',
|
||||
'anxiety',
|
||||
],
|
||||
},
|
||||
|
||||
# profile signals
|
||||
'aspirational_bio': {
|
||||
'weight': 12,
|
||||
'category': 'profile',
|
||||
'description': 'bio says what they WANT to be',
|
||||
'patterns': [
|
||||
r'aspiring',
|
||||
r'future',
|
||||
r'want(ing)? to (be|become)',
|
||||
r'learning to',
|
||||
r'trying to (become|be|learn)',
|
||||
r'hoping to',
|
||||
],
|
||||
},
|
||||
'empty_portfolio': {
|
||||
'weight': 15,
|
||||
'category': 'profile',
|
||||
'description': 'links to empty portfolio sites',
|
||||
},
|
||||
'long_aspiring': {
|
||||
'weight': 20,
|
||||
'category': 'profile',
|
||||
'description': '"aspiring" in bio for 2+ years',
|
||||
},
|
||||
}
|
||||
|
||||
# subreddits that indicate someone might be stuck
|
||||
STUCK_SUBREDDITS = {
|
||||
'learnprogramming': 8,
|
||||
'findapath': 15,
|
||||
'getdisciplined': 12,
|
||||
'getmotivated': 10,
|
||||
'decidingtobebetter': 12,
|
||||
'selfimprovement': 8,
|
||||
'adhd': 10,
|
||||
'depression': 15,
|
||||
'anxiety': 12,
|
||||
'socialanxiety': 12,
|
||||
'neet': 20,
|
||||
'lostgeneration': 15,
|
||||
'antiwork': 5, # could be aligned OR stuck
|
||||
'careerguidance': 8,
|
||||
'cscareerquestions': 5,
|
||||
}
|
||||
|
||||
|
||||
def analyze_text_for_lost_signals(text):
    """analyze text for lost-builder language patterns"""
    if not text:
        return [], 0

    text_lower = text.lower()
    signals_found = []
    total_weight = 0

    for signal_name, signal_data in LOST_SIGNALS.items():
        if 'patterns' not in signal_data:
            continue

        for pattern in signal_data['patterns']:
            if re.search(pattern, text_lower):
                signals_found.append(signal_name)
                total_weight += signal_data['weight']
                break  # only count each signal once

    return signals_found, total_weight


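A couple of the language patterns above, applied standalone to an invented sample text, show how the regex matching works:

```python
import re

# two patterns lifted from LOST_SIGNALS
patterns = {
    'wish_i_could': r'i wish i could',
    'imposter_syndrome': r'feel like (a |an )?(fraud|fake|imposter)',
}

text = "i wish i could build things, but i feel like a fraud".lower()
hits = [name for name, pattern in patterns.items() if re.search(pattern, text)]
print(hits)  # ['wish_i_could', 'imposter_syndrome']
```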
def analyze_github_for_lost_signals(profile):
    """analyze a github profile for lost-builder signals"""
    signals_found = []
    total_weight = 0

    if not profile:
        return signals_found, total_weight

    repos = profile.get('repos', []) or profile.get('top_repos', [])
    extra = profile.get('extra', {})

    public_repos = profile.get('public_repos', len(repos))
    followers = profile.get('followers', 0)
    following = profile.get('following', 0)

    # starred many but built nothing
    # (we'd need to fetch the starred count separately; approximate with the following ratio)
    if public_repos <= 2 and following > 50:
        signals_found.append('starred_many_built_nothing')
        total_weight += LOST_SIGNALS['starred_many_built_nothing']['weight']

    # account but no repos
    if public_repos == 0:
        signals_found.append('account_no_repos')
        total_weight += LOST_SIGNALS['account_no_repos']['weight']

    # check repos for signals
    forked_count = 0
    forked_modified = 0
    learning_repos = 0
    readme_only = 0

    learning_keywords = ['learning', 'tutorial', 'course', 'practice', 'exercise',
                         'bootcamp', 'udemy', 'freecodecamp', 'odin', 'codecademy']

    for repo in repos:
        name = (repo.get('name') or '').lower()
        description = (repo.get('description') or '').lower()
        language = repo.get('language')
        is_fork = repo.get('fork', False)

        # forked but never modified
        if is_fork:
            forked_count += 1
            # if pushed_at is close to created_at, it was never modified
            # (simplified: just count forks for now)

        # learning/tutorial repos
        if any(kw in name or kw in description for kw in learning_keywords):
            learning_repos += 1

        # readme only (no detected language usually means no code)
        if not language and not is_fork:
            readme_only += 1

    if forked_count >= 5 and public_repos - forked_count <= 2:
        signals_found.append('forked_never_modified')
        total_weight += LOST_SIGNALS['forked_never_modified']['weight']

    if learning_repos >= 3:
        signals_found.append('abandoned_learning_repos')
        total_weight += LOST_SIGNALS['abandoned_learning_repos']['weight']

    if readme_only >= 2:
        signals_found.append('readme_only_repos')
        total_weight += LOST_SIGNALS['readme_only_repos']['weight']

    # check the bio for lost signals
    bio = profile.get('bio') or ''
    bio_signals, bio_weight = analyze_text_for_lost_signals(bio)
    signals_found.extend(bio_signals)
    total_weight += bio_weight

    # aspirational bio check
    bio_lower = bio.lower()
    if any(re.search(p, bio_lower) for p in LOST_SIGNALS['aspirational_bio']['patterns']):
        if 'aspirational_bio' not in signals_found:
            signals_found.append('aspirational_bio')
            total_weight += LOST_SIGNALS['aspirational_bio']['weight']

    return signals_found, total_weight


def analyze_reddit_for_lost_signals(activity, subreddits):
    """analyze reddit activity for lost-builder signals"""
    signals_found = []
    total_weight = 0

    # check subreddit activity
    stuck_sub_activity = 0
    for sub in subreddits:
        if sub.lower() in STUCK_SUBREDDITS:
            stuck_sub_activity += STUCK_SUBREDDITS[sub.lower()]

    if stuck_sub_activity >= 20:
        signals_found.append('stuck_communities')
        total_weight += min(stuck_sub_activity, 30)  # cap at 30

    # analyze post/comment text
    all_text = []
    for item in activity:
        if item.get('title'):
            all_text.append(item['title'])
        if item.get('body'):
            all_text.append(item['body'])

    combined_text = ' '.join(all_text)
    text_signals, text_weight = analyze_text_for_lost_signals(combined_text)
    signals_found.extend(text_signals)
    total_weight += text_weight

    # helping others but never sharing own work
    help_count = 0
    share_count = 0
    for item in activity:
        body = (item.get('body') or '').lower()
        title = (item.get('title') or '').lower()

        # helping patterns
        if any(p in body for p in ['try this', 'you could', 'have you tried', 'i recommend']):
            help_count += 1

        # sharing patterns
        if any(p in body + title for p in ['i built', 'i made', 'my project', 'check out my', 'i created']):
            share_count += 1

    if help_count >= 5 and share_count == 0:
        signals_found.append('enthusiasm_for_others')
        total_weight += LOST_SIGNALS['enthusiasm_for_others']['weight']

    return signals_found, total_weight


def analyze_social_for_lost_signals(profile, posts):
    """analyze mastodon/social activity for lost-builder signals"""
    signals_found = []
    total_weight = 0

    # check the bio
    bio = profile.get('bio') or profile.get('note') or ''
    bio_signals, bio_weight = analyze_text_for_lost_signals(bio)
    signals_found.extend(bio_signals)
    total_weight += bio_weight

    # check posts
    boost_count = 0
    original_count = 0
    own_work_count = 0

    for post in posts:
        content = (post.get('content') or '').lower()
        is_boost = post.get('reblog') is not None or post.get('repost')

        if is_boost:
            boost_count += 1
        else:
            original_count += 1

        # check if sharing own work
        if any(p in content for p in ['i built', 'i made', 'my project', 'working on', 'just shipped']):
            own_work_count += 1

        # analyze the text
        text_signals, text_weight = analyze_text_for_lost_signals(content)
        for sig in text_signals:
            if sig not in signals_found:
                signals_found.append(sig)
                total_weight += LOST_SIGNALS[sig]['weight']

    # boosts builders but never posts own work
    if boost_count >= 10 and own_work_count == 0:
        signals_found.append('enthusiasm_for_others')
        total_weight += LOST_SIGNALS['enthusiasm_for_others']['weight']

    return signals_found, total_weight


def calculate_lost_potential_score(signals_found):
|
||||
"""calculate overall lost potential score from signals"""
|
||||
total = 0
|
||||
for signal in signals_found:
|
||||
if signal in LOST_SIGNALS:
|
||||
total += LOST_SIGNALS[signal]['weight']
|
||||
return total
|
||||
|
||||
|
||||
def classify_user(lost_score, builder_score, values_score):
|
||||
"""
|
||||
classify user as builder, lost, or neither
|
||||
|
||||
returns: 'builder' | 'lost' | 'both' | 'none'
|
||||
"""
|
||||
# high builder score = active builder
|
||||
if builder_score >= 50 and lost_score < 30:
|
||||
return 'builder'
|
||||
|
||||
# high lost score + values alignment = lost builder (priority outreach)
|
||||
if lost_score >= 40 and values_score >= 20:
|
||||
return 'lost'
|
||||
|
||||
# both signals = complex case, might be recovering
|
||||
if lost_score >= 30 and builder_score >= 30:
|
||||
return 'both'
|
||||
|
||||
return 'none'
|
||||
|
||||
|
||||
def get_signal_descriptions(signals_found):
|
||||
"""get human-readable descriptions of detected signals"""
|
||||
descriptions = []
|
||||
for signal in signals_found:
|
||||
if signal in LOST_SIGNALS:
|
||||
descriptions.append(LOST_SIGNALS[signal]['description'])
|
||||
return descriptions
|
||||
|
||||
|
||||
def should_outreach_lost(user_data, config=None):
|
||||
"""
|
||||
determine if we should reach out to a lost builder
|
||||
|
||||
considers:
|
||||
- lost_potential_score threshold
|
||||
- values alignment
|
||||
- cooldown period
|
||||
- manual review requirement
|
||||
"""
|
||||
config = config or {}
|
||||
|
||||
lost_score = user_data.get('lost_potential_score', 0)
|
||||
values_score = user_data.get('score', 0) # regular alignment score
|
||||
|
||||
# minimum thresholds
|
||||
min_lost = config.get('min_lost_score', 40)
|
||||
min_values = config.get('min_values_score', 20)
|
||||
|
||||
if lost_score < min_lost:
|
||||
return False, 'lost_score too low'
|
||||
|
||||
if values_score < min_values:
|
||||
return False, 'values_score too low'
|
||||
|
||||
# check cooldown
|
||||
last_outreach = user_data.get('last_lost_outreach')
|
||||
if last_outreach:
|
||||
cooldown_days = config.get('cooldown_days', 90)
|
||||
last_dt = datetime.fromisoformat(last_outreach)
|
||||
if datetime.now() - last_dt < timedelta(days=cooldown_days):
|
||||
return False, f'cooldown active (90 days)'
|
||||
|
||||
# always require manual review for lost outreach
|
||||
return True, 'requires_review'
|
||||
|
|
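The classification thresholds above compose in a fixed priority order (builder, then lost, then both). A standalone sketch that mirrors those thresholds, for illustration only (the real `classify_user` lives in scoutd/lost.py):

```python
# standalone mirror of classify_user's thresholds, for illustration only
def classify(lost_score, builder_score, values_score):
    if builder_score >= 50 and lost_score < 30:
        return 'builder'
    if lost_score >= 40 and values_score >= 20:
        return 'lost'
    if lost_score >= 30 and builder_score >= 30:
        return 'both'
    return 'none'

print(classify(10, 60, 50))   # active builder, few lost signals -> 'builder'
print(classify(55, 5, 25))    # strong lost signals + values alignment -> 'lost'
print(classify(35, 35, 10))   # mixed signals, possibly recovering -> 'both'
print(classify(20, 20, 80))   # values-aligned but neither threshold met -> 'none'
```

Note the ordering matters: a user with `builder_score=60, lost_score=35` falls through the first check (lost_score is not < 30) and lands in 'both'.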
@ -1,290 +0,0 @@
"""
scoutd/mastodon.py - fediverse discovery
scrapes high-signal instances: tech.lgbt, social.coop, fosstodon, hackers.town
also detects lost builders - social isolation, imposter syndrome, struggling folks
"""

import hashlib
import requests
import json
import time
import re
from datetime import datetime
from pathlib import Path

from .signals import analyze_text, ALIGNED_INSTANCES
from .lost import (
    analyze_social_for_lost_signals,
    analyze_text_for_lost_signals,
    classify_user,
    get_signal_descriptions,
)

HEADERS = {'User-Agent': 'connectd/1.0', 'Accept': 'application/json'}
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'mastodon'

TARGET_HASHTAGS = [
    'selfhosted', 'homelab', 'homeassistant', 'foss', 'opensource',
    'privacy', 'solarpunk', 'cooperative', 'cohousing', 'mutualaid',
    'intentionalcommunity', 'degoogle', 'fediverse', 'indieweb',
]


def _api_get(url, params=None):
    """rate-limited, cached request"""
    cache_key = f"{url}_{json.dumps(params or {}, sort_keys=True)}"
    # stable digest: built-in hash() is salted per process, which would
    # make cache files unreachable on the next run
    cache_file = CACHE_DIR / f"{hashlib.sha1(cache_key.encode()).hexdigest()[:16]}.json"
    CACHE_DIR.mkdir(parents=True, exist_ok=True)

    if cache_file.exists():
        try:
            data = json.loads(cache_file.read_text())
            if time.time() - data.get('_cached_at', 0) < 3600:
                return data.get('_data')
        except (json.JSONDecodeError, OSError):
            pass

    time.sleep(1)

    try:
        resp = requests.get(url, headers=HEADERS, params=params, timeout=30)
        resp.raise_for_status()
        result = resp.json()
        cache_file.write_text(json.dumps({'_cached_at': time.time(), '_data': result}))
        return result
    except requests.exceptions.RequestException as e:
        print(f"  mastodon api error: {e}")
        return None


def strip_html(text):
    """strip html tags"""
    return re.sub(r'<[^>]+>', ' ', text) if text else ''


def get_instance_directory(instance, limit=40):
    """get users from instance directory"""
    url = f'https://{instance}/api/v1/directory'
    return _api_get(url, {'limit': limit, 'local': 'true'}) or []


def get_hashtag_timeline(instance, hashtag, limit=40):
    """get posts from hashtag"""
    url = f'https://{instance}/api/v1/timelines/tag/{hashtag}'
    return _api_get(url, {'limit': limit}) or []


def get_user_statuses(instance, user_id, limit=30):
    """get user's recent posts"""
    url = f'https://{instance}/api/v1/accounts/{user_id}/statuses'
    return _api_get(url, {'limit': limit, 'exclude_reblogs': 'true'}) or []


def analyze_mastodon_user(account, instance):
    """analyze a mastodon account"""
    acct = account.get('acct', '')
    if '@' not in acct:
        acct = f"{acct}@{instance}"

    # collect text
    text_parts = []
    bio = strip_html(account.get('note', ''))
    if bio:
        text_parts.append(bio)

    display_name = account.get('display_name', '')
    if display_name:
        text_parts.append(display_name)

    # profile fields
    for field in account.get('fields', []):
        if field.get('name'):
            text_parts.append(field['name'])
        if field.get('value'):
            text_parts.append(strip_html(field['value']))

    # get recent posts (fetched once; reused for lost analysis below)
    user_id = account.get('id')
    statuses = get_user_statuses(instance, user_id) if user_id else []
    for status in statuses:
        content = strip_html(status.get('content', ''))
        if content:
            text_parts.append(content)

    full_text = ' '.join(text_parts)
    text_score, positive_signals, negative_signals = analyze_text(full_text)

    # instance bonus
    instance_bonus = ALIGNED_INSTANCES.get(instance, 0)
    total_score = text_score + instance_bonus

    # pronouns bonus
    if re.search(r'\b(they/them|she/her|he/him|xe/xem)\b', full_text, re.I):
        total_score += 10
        positive_signals.append('pronouns')

    # activity level
    statuses_count = account.get('statuses_count', 0)
    followers = account.get('followers_count', 0)
    if statuses_count > 100:
        total_score += 5

    # === LOST BUILDER DETECTION ===
    # build profile and posts for lost analysis
    profile_for_lost = {
        'bio': bio,
        'note': account.get('note'),
    }

    # convert statuses to posts format for analyze_social_for_lost_signals
    posts_for_lost = []
    for status in statuses:
        posts_for_lost.append({
            'content': strip_html(status.get('content', '')),
            'reblog': status.get('reblog'),
        })

    # analyze for lost signals
    lost_signals, lost_weight = analyze_social_for_lost_signals(profile_for_lost, posts_for_lost)

    # also check combined text for lost patterns
    text_lost_signals, text_lost_weight = analyze_text_for_lost_signals(full_text)
    for sig in text_lost_signals:
        if sig not in lost_signals:
            lost_signals.append(sig)
    lost_weight += text_lost_weight

    lost_potential_score = lost_weight

    # classify: builder, lost, both, or none
    # for mastodon, we use statuses_count as a proxy for builder activity
    builder_activity = 10 if statuses_count > 100 else 5 if statuses_count > 50 else 0
    user_type = classify_user(lost_potential_score, builder_activity, total_score)

    # confidence
    confidence = 0.3
    if len(text_parts) > 5:
        confidence += 0.2
    if statuses_count > 50:
        confidence += 0.2
    if len(positive_signals) > 3:
        confidence += 0.2
    confidence = min(confidence, 0.9)

    reasons = []
    if instance in ALIGNED_INSTANCES:
        reasons.append(f"on {instance}")
    if positive_signals:
        reasons.append(f"signals: {', '.join(positive_signals[:5])}")
    if negative_signals:
        reasons.append(f"WARNING: {', '.join(negative_signals)}")

    # add lost reasons if applicable
    if user_type == 'lost' or user_type == 'both':
        lost_descriptions = get_signal_descriptions(lost_signals)
        if lost_descriptions:
            reasons.append(f"LOST SIGNALS: {', '.join(lost_descriptions[:3])}")

    return {
        'platform': 'mastodon',
        'username': acct,
        'url': account.get('url'),
        'name': display_name,
        'bio': bio,
        'instance': instance,
        'score': total_score,
        'confidence': confidence,
        'signals': positive_signals,
        'negative_signals': negative_signals,
        'statuses_count': statuses_count,
        'followers': followers,
        'reasons': reasons,
        'scraped_at': datetime.now().isoformat(),
        # lost builder fields
        'lost_potential_score': lost_potential_score,
        'lost_signals': lost_signals,
        'user_type': user_type,
    }


def scrape_mastodon(db, limit_per_instance=40):
    """full mastodon scrape"""
    print("scoutd/mastodon: starting scrape...")

    all_accounts = []

    # 1. instance directories
    print("  scraping instance directories...")
    for instance in ALIGNED_INSTANCES:
        accounts = get_instance_directory(instance, limit=limit_per_instance)
        for acct in accounts:
            acct['_instance'] = instance
            all_accounts.append(acct)
        print(f"    {instance}: {len(accounts)} users")

    # 2. hashtag timelines
    print("  scraping hashtags...")
    seen = set()
    for tag in TARGET_HASHTAGS[:8]:
        for instance in ['fosstodon.org', 'tech.lgbt', 'social.coop']:
            posts = get_hashtag_timeline(instance, tag, limit=20)
            for post in posts:
                account = post.get('account', {})
                acct = account.get('acct', '')
                if '@' not in acct:
                    acct = f"{acct}@{instance}"

                if acct not in seen:
                    seen.add(acct)
                    account['_instance'] = instance
                    all_accounts.append(account)

    # dedupe
    unique = {}
    for acct in all_accounts:
        key = acct.get('acct', acct.get('id', ''))
        if key not in unique:
            unique[key] = acct

    print(f"  {len(unique)} unique accounts to analyze")

    # analyze
    results = []
    builders_found = 0
    lost_found = 0

    for acct_data in unique.values():
        instance = acct_data.get('_instance', 'mastodon.social')
        try:
            result = analyze_mastodon_user(acct_data, instance)
            if result and result['score'] > 0:
                results.append(result)
                db.save_human(result)

                user_type = result.get('user_type', 'none')

                if user_type == 'builder':
                    builders_found += 1
                    if result['score'] >= 40:
                        print(f"  ★ @{result['username']}: {result['score']} pts")

                elif user_type == 'lost':
                    lost_found += 1
                    lost_score = result.get('lost_potential_score', 0)
                    if lost_score >= 40:
                        print(f"  💔 @{result['username']}: lost_score={lost_score}, values={result['score']} pts")

                elif user_type == 'both':
                    builders_found += 1
                    lost_found += 1
                    print(f"  ⚡ @{result['username']}: recovering builder")

        except Exception as e:
            print(f"  error: {e}")

    print(f"scoutd/mastodon: found {len(results)} aligned humans")
    print(f"  - {builders_found} active builders")
    print(f"  - {lost_found} lost builders (need encouragement)")
    return results
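The `_api_get` helpers name cache files after a digest of the request URL and params. One subtlety: Python's built-in `hash()` on strings is salted per process (PYTHONHASHSEED), so it would produce different filenames on every run and the cache would never be reused. A `hashlib` digest is stable; a minimal sketch of the naming scheme (`cache_name` is a hypothetical helper, not a function from the codebase):

```python
import hashlib
import json

def cache_name(url, params=None):
    # deterministic cache filename: the same request maps to the
    # same file on every run, unlike the salted built-in hash()
    key = f"{url}_{json.dumps(params or {}, sort_keys=True)}"
    return hashlib.sha1(key.encode()).hexdigest()[:16] + ".json"

a = cache_name("https://fosstodon.org/api/v1/directory", {"limit": 40})
b = cache_name("https://fosstodon.org/api/v1/directory", {"limit": 40})
assert a == b  # stable across calls and across processes
print(a)
```

Sorting the params with `sort_keys=True` matters too: `{'limit': 40, 'local': 'true'}` and `{'local': 'true', 'limit': 40}` must serialize identically to hit the same cache entry.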
@ -1,196 +0,0 @@
"""
scoutd/matrix.py - matrix room membership discovery
finds users in multiple aligned public rooms
"""

import hashlib
import requests
import json
import time
from datetime import datetime
from pathlib import Path
from collections import defaultdict
from urllib.parse import quote

from .signals import analyze_text

HEADERS = {'User-Agent': 'connectd/1.0', 'Accept': 'application/json'}
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'matrix'

# public matrix rooms to check membership
ALIGNED_ROOMS = [
    '#homeassistant:matrix.org',
    '#esphome:matrix.org',
    '#selfhosted:matrix.org',
    '#privacy:matrix.org',
    '#solarpunk:matrix.org',
    '#cooperative:matrix.org',
    '#foss:matrix.org',
    '#linux:matrix.org',
]

# homeservers to query
HOMESERVERS = [
    'matrix.org',
    'matrix.envs.net',
    'tchncs.de',
]


def _api_get(url, params=None):
    """rate-limited, cached request"""
    cache_key = f"{url}_{json.dumps(params or {}, sort_keys=True)}"
    # stable digest: built-in hash() is salted per process
    cache_file = CACHE_DIR / f"{hashlib.sha1(cache_key.encode()).hexdigest()[:16]}.json"
    CACHE_DIR.mkdir(parents=True, exist_ok=True)

    if cache_file.exists():
        try:
            data = json.loads(cache_file.read_text())
            if time.time() - data.get('_cached_at', 0) < 3600:
                return data.get('_data')
        except (json.JSONDecodeError, OSError):
            pass

    time.sleep(1)

    try:
        resp = requests.get(url, headers=HEADERS, params=params, timeout=30)
        resp.raise_for_status()
        result = resp.json()
        cache_file.write_text(json.dumps({'_cached_at': time.time(), '_data': result}))
        return result
    except requests.exceptions.RequestException:
        # matrix apis often fail, don't spam errors
        return None


def get_room_members(homeserver, room_alias):
    """
    get members of a public room
    note: most matrix servers don't expose this publicly
    this is a best-effort scrape
    """
    # resolve room alias to id first
    try:
        # the leading '#' must be percent-encoded, or it is parsed as a URL fragment
        alias_url = f'https://{homeserver}/_matrix/client/r0/directory/room/{quote(room_alias, safe="")}'
        alias_data = _api_get(alias_url)
        if not alias_data or 'room_id' not in alias_data:
            return []

        room_id = alias_data['room_id']

        # try to get members (usually requires auth)
        members_url = f'https://{homeserver}/_matrix/client/r0/rooms/{quote(room_id, safe="")}/members'
        members_data = _api_get(members_url)

        if members_data and 'chunk' in members_data:
            members = []
            for event in members_data['chunk']:
                if event.get('type') == 'm.room.member' and event.get('content', {}).get('membership') == 'join':
                    user_id = event.get('state_key')
                    display_name = event.get('content', {}).get('displayname')
                    if user_id:
                        members.append({'user_id': user_id, 'display_name': display_name})
            return members
    except Exception:
        pass

    return []


def get_public_rooms(homeserver, limit=100):
    """get public rooms directory"""
    url = f'https://{homeserver}/_matrix/client/r0/publicRooms'
    data = _api_get(url, {'limit': limit})
    return data.get('chunk', []) if data else []


def analyze_matrix_user(user_id, rooms_joined, display_name=None):
    """analyze a matrix user based on room membership"""
    # score based on room membership overlap
    room_score = len(rooms_joined) * 10

    # multi-room bonus
    if len(rooms_joined) >= 4:
        room_score += 20
    elif len(rooms_joined) >= 2:
        room_score += 10

    # analyze display name if available
    text_score = 0
    signals = []
    if display_name:
        text_score, signals, _ = analyze_text(display_name)

    total_score = room_score + text_score

    confidence = 0.3
    if len(rooms_joined) >= 3:
        confidence += 0.3
    if display_name:
        confidence += 0.1
    confidence = min(confidence, 0.8)

    reasons = [f"in {len(rooms_joined)} aligned rooms: {', '.join(rooms_joined[:3])}"]
    if signals:
        reasons.append(f"signals: {', '.join(signals[:3])}")

    return {
        'platform': 'matrix',
        'username': user_id,
        'url': f"https://matrix.to/#/{user_id}",
        'name': display_name,
        'score': total_score,
        'confidence': confidence,
        'signals': signals,
        'rooms': rooms_joined,
        'reasons': reasons,
        'scraped_at': datetime.now().isoformat(),
    }


def scrape_matrix(db):
    """
    matrix scrape - limited due to auth requirements
    best effort on public room data
    """
    print("scoutd/matrix: starting scrape (limited - most apis require auth)...")

    user_rooms = defaultdict(list)

    # try to get public room directories
    for homeserver in HOMESERVERS:
        print(f"  checking {homeserver} public rooms...")
        rooms = get_public_rooms(homeserver, limit=50)

        for room in rooms:
            room_alias = room.get('canonical_alias', '')
            # check if it matches any aligned room patterns
            aligned_keywords = ['homeassistant', 'selfhosted', 'privacy', 'linux', 'foss', 'cooperative']
            if any(kw in room_alias.lower() or kw in room.get('name', '').lower() for kw in aligned_keywords):
                print(f"    found aligned room: {room_alias or room.get('name')}")

    # try to get members from aligned rooms (usually fails without auth)
    for room_alias in ALIGNED_ROOMS[:3]:  # limit attempts
        for homeserver in HOMESERVERS[:1]:  # just try matrix.org
            members = get_room_members(homeserver, room_alias)
            if members:
                print(f"  {room_alias}: {len(members)} members")
                for member in members:
                    user_rooms[member['user_id']].append(room_alias)

    # filter for multi-room users
    multi_room = {u: rooms for u, rooms in user_rooms.items() if len(rooms) >= 2}
    print(f"  {len(multi_room)} users in 2+ aligned rooms")

    # analyze
    results = []
    for user_id, rooms in multi_room.items():
        try:
            result = analyze_matrix_user(user_id, rooms)
            if result and result['score'] > 0:
                results.append(result)
                db.save_human(result)
        except Exception as e:
            print(f"  error: {e}")

    print(f"scoutd/matrix: found {len(results)} aligned humans (limited by auth)")
    return results
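One detail worth calling out about the room-alias lookup: a Matrix alias like `#homeassistant:matrix.org` cannot be interpolated into a URL path as-is, because `#` starts a URL fragment and the server would never see the alias. Percent-encoding with `urllib.parse.quote` fixes this; a minimal sketch (`alias_url` is an illustrative helper, not a function from the codebase):

```python
from urllib.parse import quote

def alias_url(homeserver, room_alias):
    # '#' starts a URL fragment and ':' is reserved, so the whole
    # alias must be percent-encoded before going into the path
    return f"https://{homeserver}/_matrix/client/r0/directory/room/{quote(room_alias, safe='')}"

url = alias_url("matrix.org", "#homeassistant:matrix.org")
print(url)
# → https://matrix.org/_matrix/client/r0/directory/room/%23homeassistant%3Amatrix.org
```

`safe=''` disables `quote`'s default of leaving `/` unescaped, which matters here since the alias is a single path segment.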
@ -1,503 +0,0 @@
"""
scoutd/reddit.py - reddit discovery (DISCOVERY ONLY, NOT OUTREACH)

reddit is a SIGNAL SOURCE, not a contact channel.
flow:
1. scrape reddit for users active in target subs
2. extract their reddit profile
3. look for links TO other platforms (github, mastodon, website, etc.)
4. add to scout database with reddit as signal source
5. reach out via their OTHER platforms, never reddit

if reddit user has no external links:
- add to manual_queue with note "reddit-only, needs manual review"

also detects lost builders - stuck in learnprogramming for years, imposter syndrome, etc.
"""

import hashlib
import requests
import json
import time
import re
from datetime import datetime
from pathlib import Path
from collections import defaultdict

from .signals import analyze_text, ALIGNED_SUBREDDITS, NEGATIVE_SUBREDDITS
from .lost import (
    analyze_reddit_for_lost_signals,
    analyze_text_for_lost_signals,
    classify_user,
    get_signal_descriptions,
    STUCK_SUBREDDITS,
)

HEADERS = {'User-Agent': 'connectd:v1.0 (community discovery)'}
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'reddit'

# patterns for extracting external platform links
PLATFORM_PATTERNS = {
    'github': [
        r'github\.com/([a-zA-Z0-9_-]+)',
        r'gh:\s*@?([a-zA-Z0-9_-]+)',
    ],
    'mastodon': [
        r'@([a-zA-Z0-9_]+)@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})',
        r'mastodon\.social/@([a-zA-Z0-9_]+)',
        r'fosstodon\.org/@([a-zA-Z0-9_]+)',
        r'hachyderm\.io/@([a-zA-Z0-9_]+)',
        r'tech\.lgbt/@([a-zA-Z0-9_]+)',
    ],
    'twitter': [
        r'twitter\.com/([a-zA-Z0-9_]+)',
        r'x\.com/([a-zA-Z0-9_]+)',
        r'(?:^|\s)@([a-zA-Z0-9_]{1,15})(?:\s|$)',  # bare @handle
    ],
    'bluesky': [
        r'bsky\.app/profile/([a-zA-Z0-9_.-]+)',
        r'([a-zA-Z0-9_-]+)\.bsky\.social',
    ],
    'website': [
        r'https?://([a-zA-Z0-9_-]+\.[a-zA-Z]{2,}[a-zA-Z0-9./_-]*)',
    ],
    'matrix': [
        r'@([a-zA-Z0-9_-]+):([a-zA-Z0-9.-]+)',
    ],
}


def _api_get(url, params=None):
    """rate-limited, cached request"""
    cache_key = f"{url}_{json.dumps(params or {}, sort_keys=True)}"
    # stable digest: built-in hash() is salted per process
    cache_file = CACHE_DIR / f"{hashlib.sha1(cache_key.encode()).hexdigest()[:16]}.json"
    CACHE_DIR.mkdir(parents=True, exist_ok=True)

    if cache_file.exists():
        try:
            data = json.loads(cache_file.read_text())
            if time.time() - data.get('_cached_at', 0) < 3600:
                return data.get('_data')
        except (json.JSONDecodeError, OSError):
            pass

    time.sleep(2)  # reddit rate limit

    try:
        resp = requests.get(url, headers=HEADERS, params=params, timeout=30)
        resp.raise_for_status()
        result = resp.json()
        cache_file.write_text(json.dumps({'_cached_at': time.time(), '_data': result}))
        return result
    except requests.exceptions.RequestException as e:
        print(f"  reddit api error: {e}")
        return None


def extract_external_links(text):
    """extract links to other platforms from text"""
    links = {}

    if not text:
        return links

    for platform, patterns in PLATFORM_PATTERNS.items():
        for pattern in patterns:
            matches = re.findall(pattern, text, re.IGNORECASE)
            if matches:
                if platform == 'mastodon' and isinstance(matches[0], tuple):
                    # full fediverse handle
                    links[platform] = f"@{matches[0][0]}@{matches[0][1]}"
                elif platform == 'matrix' and isinstance(matches[0], tuple):
                    links[platform] = f"@{matches[0][0]}:{matches[0][1]}"
                elif platform == 'website':
                    # skip reddit/imgur/etc
                    for match in matches:
                        if not any(x in match.lower() for x in ['reddit', 'imgur', 'redd.it', 'i.redd']):
                            links[platform] = f"https://{match}"
                            break
                else:
                    links[platform] = matches[0]
                break

    return links


def get_user_profile(username):
    """get user profile including bio/description"""
    url = f'https://www.reddit.com/user/{username}/about.json'
    data = _api_get(url)

    if not data or 'data' not in data:
        return None

    profile = data['data']
    return {
        'username': username,
        'name': profile.get('name'),
        'bio': profile.get('subreddit', {}).get('public_description', ''),
        'title': profile.get('subreddit', {}).get('title', ''),
        'icon': profile.get('icon_img'),
        'created_utc': profile.get('created_utc'),
        'total_karma': profile.get('total_karma', 0),
        'link_karma': profile.get('link_karma', 0),
        'comment_karma': profile.get('comment_karma', 0),
    }


def get_subreddit_users(subreddit, limit=100):
    """get recent posters/commenters from a subreddit"""
    users = set()

    # posts
    url = f'https://www.reddit.com/r/{subreddit}/new.json'
    data = _api_get(url, {'limit': limit})
    if data and 'data' in data:
        for post in data['data'].get('children', []):
            author = post['data'].get('author')
            if author and author not in ['[deleted]', 'AutoModerator']:
                users.add(author)

    # comments
    url = f'https://www.reddit.com/r/{subreddit}/comments.json'
    data = _api_get(url, {'limit': limit})
    if data and 'data' in data:
        for comment in data['data'].get('children', []):
            author = comment['data'].get('author')
            if author and author not in ['[deleted]', 'AutoModerator']:
                users.add(author)

    return users


def get_user_activity(username):
    """get user's posts and comments"""
    activity = []

    # posts
    url = f'https://www.reddit.com/user/{username}/submitted.json'
    data = _api_get(url, {'limit': 100})
    if data and 'data' in data:
        for post in data['data'].get('children', []):
            activity.append({
                'type': 'post',
                'subreddit': post['data'].get('subreddit'),
                'title': post['data'].get('title', ''),
                'body': post['data'].get('selftext', ''),
                'score': post['data'].get('score', 0),
            })

    # comments
    url = f'https://www.reddit.com/user/{username}/comments.json'
    data = _api_get(url, {'limit': 100})
    if data and 'data' in data:
        for comment in data['data'].get('children', []):
            activity.append({
                'type': 'comment',
                'subreddit': comment['data'].get('subreddit'),
                'body': comment['data'].get('body', ''),
                'score': comment['data'].get('score', 0),
            })

    return activity


def analyze_reddit_user(username):
    """
    analyze a reddit user for alignment and extract external platform links.

    reddit is DISCOVERY ONLY - we find users here but contact them elsewhere.
    """
    activity = get_user_activity(username)
    if not activity:
        return None

    # get profile for bio
    profile = get_user_profile(username)

    # count subreddit activity
    sub_activity = defaultdict(int)
    text_parts = []
    total_karma = 0

    for item in activity:
        sub = (item.get('subreddit') or '').lower()
        if sub:
            sub_activity[sub] += 1
        if item.get('title'):
            text_parts.append(item['title'])
        if item.get('body'):
            text_parts.append(item['body'])
        total_karma += item.get('score', 0)

    full_text = ' '.join(text_parts)
    text_score, positive_signals, negative_signals = analyze_text(full_text)

    # EXTRACT EXTERNAL LINKS - this is the key part
    # check profile bio first
    external_links = {}
    if profile:
        bio_text = f"{profile.get('bio', '')} {profile.get('title', '')}"
        external_links.update(extract_external_links(bio_text))

    # also scan posts/comments for links (people often share their github etc)
    activity_links = extract_external_links(full_text)
    for platform, link in activity_links.items():
        if platform not in external_links:
            external_links[platform] = link

    # subreddit scoring
    sub_score = 0
    aligned_subs = []
    for sub, count in sub_activity.items():
        weight = ALIGNED_SUBREDDITS.get(sub, 0)
        if weight > 0:
            sub_score += weight * min(count, 5)
            aligned_subs.append(sub)

    # multi-sub bonus
    if len(aligned_subs) >= 5:
        sub_score += 30
    elif len(aligned_subs) >= 3:
        sub_score += 15

    # negative sub penalty
    for sub in sub_activity:
        if sub.lower() in [n.lower() for n in NEGATIVE_SUBREDDITS]:
            sub_score -= 50
            negative_signals.append(f"r/{sub}")

    total_score = text_score + sub_score

    # bonus if they have external links (we can actually contact them)
    if external_links.get('github'):
        total_score += 10
        positive_signals.append('has github')
    if external_links.get('mastodon'):
        total_score += 10
        positive_signals.append('has mastodon')
    if external_links.get('website'):
        total_score += 5
        positive_signals.append('has website')

    # === LOST BUILDER DETECTION ===
    # reddit is HIGH SIGNAL for lost builders - stuck in learnprogramming,
    # imposter syndrome posts, "i wish i could" language, etc.
    subreddits_list = list(sub_activity.keys())
    lost_signals, lost_weight = analyze_reddit_for_lost_signals(activity, subreddits_list)

    # also check full text for lost patterns (already done partially in analyze_reddit_for_lost_signals)
    text_lost_signals, text_lost_weight = analyze_text_for_lost_signals(full_text)
    for sig in text_lost_signals:
        if sig not in lost_signals:
            lost_signals.append(sig)
    lost_weight += text_lost_weight

    lost_potential_score = lost_weight

    # classify: builder, lost, both, or none
    # for reddit, builder_score is based on having external links + high karma
    builder_activity = 0
    if external_links.get('github'):
        builder_activity += 20
    if total_karma > 1000:
        builder_activity += 15
    elif total_karma > 500:
        builder_activity += 10

    user_type = classify_user(lost_potential_score, builder_activity, total_score)

    # confidence
    confidence = 0.3
    if len(activity) > 20:
        confidence += 0.2
    if len(aligned_subs) >= 2:
        confidence += 0.2
    if len(text_parts) > 10:
        confidence += 0.2
    # higher confidence if we have contact methods
    if external_links:
        confidence += 0.1
    confidence = min(confidence, 0.95)

    reasons = []
    if aligned_subs:
        reasons.append(f"active in: {', '.join(aligned_subs[:5])}")
    if positive_signals:
        reasons.append(f"signals: {', '.join(positive_signals[:5])}")
    if negative_signals:
        reasons.append(f"WARNING: {', '.join(negative_signals)}")
    if external_links:
        reasons.append(f"external: {', '.join(external_links.keys())}")

    # add lost reasons if applicable
    if user_type == 'lost' or user_type == 'both':
        lost_descriptions = get_signal_descriptions(lost_signals)
        if lost_descriptions:
            reasons.append(f"LOST SIGNALS: {', '.join(lost_descriptions[:3])}")

    # determine if this is reddit-only (needs manual review)
    reddit_only = len(external_links) == 0
    if reddit_only:
        reasons.append("REDDIT-ONLY: needs manual review for outreach")

    return {
        'platform': 'reddit',
        'username': username,
        'url': f"https://reddit.com/u/{username}",
        'score': total_score,
        'confidence': confidence,
        'signals': positive_signals,
        'negative_signals': negative_signals,
        'subreddits': aligned_subs,
        'activity_count': len(activity),
        'karma': total_karma,
        'reasons': reasons,
        'scraped_at': datetime.now().isoformat(),
        # external platform links for outreach
        'external_links': external_links,
        'reddit_only': reddit_only,
        'extra': {
            'github': external_links.get('github'),
            'mastodon': external_links.get('mastodon'),
            'twitter': external_links.get('twitter'),
            'bluesky': external_links.get('bluesky'),
            'website': external_links.get('website'),
            'matrix': external_links.get('matrix'),
            'reddit_karma': total_karma,
            'reddit_activity': len(activity),
        },
        # lost builder fields
        'lost_potential_score': lost_potential_score,
        'lost_signals': lost_signals,
        'user_type': user_type,
    }


def scrape_reddit(db, limit_per_sub=50):
    """
    full reddit scrape - DISCOVERY ONLY

    finds aligned users, extracts external links for outreach.
    reddit-only users go to manual queue.
    """
    print("scoutd/reddit: starting scrape (discovery only, not outreach)...")

    # find users in multiple aligned subs
    user_subs = defaultdict(set)

    # aligned subs - active builders
    priority_subs = ['intentionalcommunity', 'cohousing', 'selfhosted',
                     'homeassistant', 'solarpunk', 'cooperatives', 'privacy',
                     'localllama', 'homelab', 'degoogle', 'pihole', 'unraid']

    # lost builder subs - people who need encouragement
    # these folks might be stuck, but they have aligned interests
    lost_subs = ['learnprogramming', 'findapath', 'getdisciplined',
                 'careerguidance', 'cscareerquestions', 'decidingtobebetter']

    # scrape both - we want to find lost builders with aligned interests
    all_subs = priority_subs + lost_subs

    for sub in all_subs:
        print(f"  scraping r/{sub}...")
        users = get_subreddit_users(sub, limit=limit_per_sub)
        for user in users:
            user_subs[user].add(sub)
        print(f"    found {len(users)} users")

    # filter for multi-sub users
    multi_sub = {u: subs for u, subs in user_subs.items() if len(subs) >= 2}
    print(f"  {len(multi_sub)} users in 2+ aligned subs")
|
||||
|
||||
# analyze
|
||||
results = []
|
||||
reddit_only_count = 0
|
||||
external_link_count = 0
|
||||
builders_found = 0
|
||||
lost_found = 0
|
||||
|
||||
for username in multi_sub:
|
||||
try:
|
||||
result = analyze_reddit_user(username)
|
||||
if result and result['score'] > 0:
|
||||
results.append(result)
|
||||
db.save_human(result)
|
||||
|
||||
user_type = result.get('user_type', 'none')
|
||||
|
||||
# track lost builders - reddit is high signal for these
|
||||
if user_type == 'lost':
|
||||
lost_found += 1
|
||||
lost_score = result.get('lost_potential_score', 0)
|
||||
if lost_score >= 40:
|
||||
print(f" 💔 u/{username}: lost_score={lost_score}, values={result['score']} pts")
|
||||
# lost builders also go to manual queue if reddit-only
|
||||
if result.get('reddit_only'):
|
||||
_add_to_manual_queue(result)
|
||||
|
||||
elif user_type == 'builder':
|
||||
builders_found += 1
|
||||
|
||||
elif user_type == 'both':
|
||||
builders_found += 1
|
||||
lost_found += 1
|
||||
print(f" ⚡ u/{username}: recovering builder")
|
||||
|
||||
# track external links
|
||||
if result.get('reddit_only'):
|
||||
reddit_only_count += 1
|
||||
# add high-value users to manual queue for review
|
||||
if result['score'] >= 50 and user_type != 'lost': # lost already added above
|
||||
_add_to_manual_queue(result)
|
||||
print(f" 📋 u/{username}: {result['score']} pts (reddit-only → manual queue)")
|
||||
else:
|
||||
external_link_count += 1
|
||||
if result['score'] >= 50 and user_type == 'builder':
|
||||
links = list(result.get('external_links', {}).keys())
|
||||
print(f" ★ u/{username}: {result['score']} pts → {', '.join(links)}")
|
||||
|
||||
except Exception as e:
|
||||
print(f" error on {username}: {e}")
|
||||
|
||||
print(f"scoutd/reddit: found {len(results)} aligned humans")
|
||||
print(f" - {builders_found} active builders")
|
||||
print(f" - {lost_found} lost builders (need encouragement)")
|
||||
print(f" - {external_link_count} with external links (reachable)")
|
||||
print(f" - {reddit_only_count} reddit-only (manual queue)")
|
||||
return results
|
||||
|
||||
|
||||
def _add_to_manual_queue(result):
|
||||
"""add reddit-only user to manual queue for review"""
|
||||
from pathlib import Path
|
||||
import json
|
||||
|
||||
queue_file = Path(__file__).parent.parent / 'data' / 'manual_queue.json'
|
||||
queue_file.parent.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
queue = []
|
||||
if queue_file.exists():
|
||||
try:
|
||||
queue = json.loads(queue_file.read_text())
|
||||
except:
|
||||
pass
|
||||
|
||||
# check if already in queue
|
||||
existing = [q for q in queue if q.get('username') == result['username'] and q.get('platform') == 'reddit']
|
||||
if existing:
|
||||
return
|
||||
|
||||
queue.append({
|
||||
'platform': 'reddit',
|
||||
'username': result['username'],
|
||||
'url': result['url'],
|
||||
'score': result['score'],
|
||||
'subreddits': result.get('subreddits', []),
|
||||
'signals': result.get('signals', []),
|
||||
'reasons': result.get('reasons', []),
|
||||
'note': 'reddit-only user - no external links found. DM manually if promising.',
|
||||
'queued_at': datetime.now().isoformat(),
|
||||
'status': 'pending',
|
||||
})
|
||||
|
||||
queue_file.write_text(json.dumps(queue, indent=2))
|
||||
|
|
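The queue handling in `_add_to_manual_queue` (load, dedupe on platform+username, append, rewrite) can be exercised in isolation. This is a condensed sketch, not the module itself: it uses a temp directory instead of the real `data/manual_queue.json` path and takes the queue file as a parameter.

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory

def add_to_queue(queue_file: Path, entry: dict) -> bool:
    """append entry unless the same (platform, username) is already queued"""
    queue = []
    if queue_file.exists():
        try:
            queue = json.loads(queue_file.read_text())
        except (json.JSONDecodeError, OSError):
            pass
    if any(q.get('username') == entry['username'] and q.get('platform') == entry['platform']
           for q in queue):
        return False  # already queued, skip
    queue.append(entry)
    queue_file.write_text(json.dumps(queue, indent=2))
    return True

with TemporaryDirectory() as tmp:
    qf = Path(tmp) / 'manual_queue.json'
    entry = {'platform': 'reddit', 'username': 'example_user', 'status': 'pending'}
    print(add_to_queue(qf, entry))           # True  - first add
    print(add_to_queue(qf, entry))           # False - deduplicated
    print(len(json.loads(qf.read_text())))   # 1
```

Rewriting the whole file on every append is fine at this scale; the dedupe check is what keeps repeat scrapes from flooding the manual queue.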
@@ -1,158 +0,0 @@
"""
shared signal patterns for all scrapers
"""

import re

# positive signals - what we're looking for
POSITIVE_PATTERNS = [
    # values
    (r'\b(solarpunk|cyberpunk)\b', 'solarpunk', 10),
    (r'\b(anarchis[tm]|mutual.?aid)\b', 'mutual_aid', 10),
    (r'\b(cooperative|collective|worker.?owned?|coop|co.?op)\b', 'cooperative', 15),
    (r'\b(community|commons)\b', 'community', 5),
    (r'\b(intentional.?community|cohousing|commune)\b', 'intentional_community', 20),

    # queer-friendly
    (r'\b(queer|lgbtq?|trans|nonbinary|enby|genderqueer)\b', 'queer', 15),
    (r'\b(they/them|she/her|he/him|xe/xem|any.?pronouns)\b', 'pronouns', 10),
    (r'\bblm\b', 'blm', 5),
    (r'\b(acab|1312)\b', 'acab', 5),

    # tech values
    (r'\b(privacy|surveillance|anti.?surveillance)\b', 'privacy', 10),
    (r'\b(self.?host(?:ed|ing)?|homelab|home.?server)\b', 'selfhosted', 15),
    (r'\b(local.?first|offline.?first)\b', 'local_first', 15),
    (r'\b(decentralized?|federation|federated|fediverse)\b', 'decentralized', 10),
    (r'\b(foss|libre|open.?source|copyleft)\b', 'foss', 10),
    (r'\b(home.?assistant|home.?automation)\b', 'home_automation', 10),
    (r'\b(mesh|p2p|peer.?to.?peer)\b', 'p2p', 10),
    (r'\b(matrix|xmpp|irc)\b', 'federated_chat', 5),
    (r'\b(degoogle|de.?google)\b', 'degoogle', 10),

    # location/availability
    (r'\b(seattle|portland|pnw|cascadia|pacific.?northwest)\b', 'pnw', 20),
    (r'\b(washington|oregon)\b', 'pnw_state', 10),
    (r'\b(remote|anywhere|relocate|looking.?to.?move)\b', 'remote', 10),

    # anti-capitalism
    (r'\b(anti.?capitalis[tm]|post.?capitalis[tm]|degrowth)\b', 'anticapitalist', 10),

    # neurodivergent (often overlaps with our values)
    (r'\b(neurodivergent|adhd|autistic|autism)\b', 'neurodivergent', 5),

    # technical skills (bonus for builders)
    (r'\b(rust|go|python|typescript)\b', 'modern_lang', 3),
    (r'\b(linux|bsd|nixos)\b', 'unix', 3),
    (r'\b(kubernetes|docker|podman)\b', 'containers', 3),
]

# negative signals - red flags
NEGATIVE_PATTERNS = [
    (r'\b(qanon|maga|trump|wwg1wga)\b', 'maga', -50),
    (r'\b(covid.?hoax|plandemic|5g.?conspiracy)\b', 'conspiracy', -50),
    (r'\b(nwo|illuminati|deep.?state)\b', 'conspiracy', -30),
    (r'\b(anti.?vax|antivax)\b', 'antivax', -30),
    (r'\b(sovereign.?citizen)\b', 'sovcit', -40),
    (r'\b(crypto.?bro|web3|nft|blockchain|bitcoin|ethereum)\b', 'crypto', -15),
    (r'\b(conservative|republican)\b', 'conservative', -20),
    (r'\b(free.?speech.?absolutist)\b', 'freeze_peach', -20),
]

# target topics for repo discovery
TARGET_TOPICS = [
    'local-first', 'self-hosted', 'privacy', 'mesh-network',
    'cooperative', 'solarpunk', 'decentralized', 'p2p',
    'fediverse', 'activitypub', 'matrix-org', 'homeassistant',
    'esphome', 'open-source-hardware', 'right-to-repair',
    'mutual-aid', 'commons', 'degoogle', 'privacy-tools',
]

# ecosystem repos - high signal contributors
ECOSYSTEM_REPOS = [
    'home-assistant/core',
    'esphome/esphome',
    'matrix-org/synapse',
    'LemmyNet/lemmy',
    'mastodon/mastodon',
    'owncast/owncast',
    'nextcloud/server',
    'immich-app/immich',
    'jellyfin/jellyfin',
    'navidrome/navidrome',
    'paperless-ngx/paperless-ngx',
    'actualbudget/actual',
    'firefly-iii/firefly-iii',
    'logseq/logseq',
    'AppFlowy-IO/AppFlowy',
    'siyuan-note/siyuan',
    'anytype/anytype-ts',
    'calcom/cal.com',
    'plausible/analytics',
    'umami-software/umami',
]

# aligned subreddits
ALIGNED_SUBREDDITS = {
    'intentionalcommunity': 25,
    'cohousing': 25,
    'cooperatives': 20,
    'solarpunk': 20,
    'selfhosted': 15,
    'homeassistant': 15,
    'homelab': 10,
    'privacy': 15,
    'PrivacyGuides': 15,
    'degoogle': 15,
    'anticonsumption': 10,
    'Frugal': 5,
    'simpleliving': 5,
    'Seattle': 10,
    'Portland': 10,
    'cascadia': 15,
    'linux': 5,
    'opensource': 10,
    'FOSS': 10,
}

# negative subreddits
NEGATIVE_SUBREDDITS = [
    'conspiracy', 'conservative', 'walkaway', 'louderwithcrowder',
    'JordanPeterson', 'TimPool', 'NoNewNormal', 'LockdownSkepticism',
]

# high-signal mastodon instances
ALIGNED_INSTANCES = {
    'tech.lgbt': 20,
    'social.coop': 25,
    'fosstodon.org': 10,
    'hackers.town': 15,
    'hachyderm.io': 10,
    'infosec.exchange': 5,
}


def analyze_text(text):
    """
    analyze text for signals
    returns: (score, signals_found, negative_signals)
    """
    if not text:
        return 0, [], []

    text = text.lower()
    score = 0
    signals = []
    negatives = []

    for pattern, signal_name, points in POSITIVE_PATTERNS:
        if re.search(pattern, text, re.IGNORECASE):
            score += points
            signals.append(signal_name)

    for pattern, signal_name, points in NEGATIVE_PATTERNS:
        if re.search(pattern, text, re.IGNORECASE):
            score += points  # points are already negative
            negatives.append(signal_name)

    return score, list(set(signals)), list(set(negatives))
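The scoring loop in `analyze_text` can be sketched self-contained; this condensed version reproduces only three of the positive patterns and returns signals in match order (the real function de-duplicates through `set()`, so its ordering is not guaranteed).

```python
import re

# condensed sketch of the signals.py scoring loop - three patterns only
PATTERNS = [
    (r'\b(solarpunk|cyberpunk)\b', 'solarpunk', 10),
    (r'\b(self.?host(?:ed|ing)?|homelab|home.?server)\b', 'selfhosted', 15),
    (r'\b(seattle|portland|pnw|cascadia|pacific.?northwest)\b', 'pnw', 20),
]

def analyze_text(text):
    """sum points for every pattern that matches, case-insensitively"""
    if not text:
        return 0, []
    score, signals = 0, []
    for pattern, name, points in PATTERNS:
        if re.search(pattern, text, re.IGNORECASE):
            score += points
            signals.append(name)
    return score, signals

score, signals = analyze_text("Solarpunk tinkerer, self-hosted everything, in Seattle")
print(score, signals)  # → 45 ['solarpunk', 'selfhosted', 'pnw']
```

Each pattern fires at most once per text, so a bio that mentions "self-hosted" five times still contributes 15 points, which keeps a single obsession from dominating the score.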
@@ -1,255 +0,0 @@
"""
scoutd/twitter.py - twitter/x discovery via nitter instances

scrapes nitter (twitter frontend) to find users posting about aligned topics
without needing twitter API access

nitter instances rotate to avoid rate limits
"""

import requests
import json
import time
import re
from datetime import datetime
from pathlib import Path
from bs4 import BeautifulSoup

from .signals import analyze_text

HEADERS = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0'}
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'twitter'

# nitter instances (rotate through these)
NITTER_INSTANCES = [
    'nitter.privacydev.net',
    'nitter.poast.org',
    'nitter.woodland.cafe',
    'nitter.esmailelbob.xyz',
]

# hashtags to search
ALIGNED_HASHTAGS = [
    'selfhosted', 'homelab', 'homeassistant', 'foss', 'opensource',
    'privacy', 'solarpunk', 'cooperative', 'mutualaid', 'localfirst',
    'indieweb', 'smallweb', 'permacomputing', 'degrowth', 'techworkers',
]

_current_instance_idx = 0


def get_nitter_instance():
    """get current nitter instance, rotate on failure"""
    global _current_instance_idx
    return NITTER_INSTANCES[_current_instance_idx % len(NITTER_INSTANCES)]


def rotate_instance():
    """switch to next nitter instance"""
    global _current_instance_idx
    _current_instance_idx += 1


def _scrape_page(url, retries=3):
    """scrape a nitter page with instance rotation"""
    for attempt in range(retries):
        instance = get_nitter_instance()
        full_url = url.replace('{instance}', instance)

        # check cache
        cache_key = f"{full_url}"
        cache_file = CACHE_DIR / f"{hash(cache_key) & 0xffffffff}.json"
        CACHE_DIR.mkdir(parents=True, exist_ok=True)

        if cache_file.exists():
            try:
                data = json.loads(cache_file.read_text())
                if time.time() - data.get('_cached_at', 0) < 3600:
                    return data.get('_html')
            except:
                pass

        time.sleep(2)  # rate limit

        try:
            resp = requests.get(full_url, headers=HEADERS, timeout=30)
            if resp.status_code == 200:
                cache_file.write_text(json.dumps({
                    '_cached_at': time.time(),
                    '_html': resp.text
                }))
                return resp.text
            elif resp.status_code in [429, 503]:
                print(f"  nitter {instance} rate limited, rotating...")
                rotate_instance()
            else:
                print(f"  nitter error: {resp.status_code}")
                return None
        except Exception as e:
            print(f"  nitter {instance} error: {e}")
            rotate_instance()

    return None


def search_hashtag(hashtag):
    """search for tweets with hashtag"""
    url = f"https://{{instance}}/search?q=%23{hashtag}&f=tweets"
    html = _scrape_page(url)
    if not html:
        return []

    soup = BeautifulSoup(html, 'html.parser')
    tweets = []

    for tweet_div in soup.select('.timeline-item'):
        try:
            username_elem = tweet_div.select_one('.username')
            content_elem = tweet_div.select_one('.tweet-content')
            fullname_elem = tweet_div.select_one('.fullname')

            if username_elem and content_elem:
                username = username_elem.text.strip().lstrip('@')
                tweets.append({
                    'username': username,
                    'name': fullname_elem.text.strip() if fullname_elem else username,
                    'content': content_elem.text.strip(),
                })
        except Exception as e:
            continue

    return tweets


def get_user_profile(username):
    """get user profile from nitter"""
    url = f"https://{{instance}}/{username}"
    html = _scrape_page(url)
    if not html:
        return None

    soup = BeautifulSoup(html, 'html.parser')

    try:
        bio_elem = soup.select_one('.profile-bio')
        bio = bio_elem.text.strip() if bio_elem else ''

        location_elem = soup.select_one('.profile-location')
        location = location_elem.text.strip() if location_elem else ''

        website_elem = soup.select_one('.profile-website a')
        website = website_elem.get('href') if website_elem else ''

        # get recent tweets for more signal
        tweets = []
        for tweet_div in soup.select('.timeline-item')[:10]:
            content_elem = tweet_div.select_one('.tweet-content')
            if content_elem:
                tweets.append(content_elem.text.strip())

        return {
            'username': username,
            'bio': bio,
            'location': location,
            'website': website,
            'recent_tweets': tweets,
        }
    except Exception as e:
        print(f"  error parsing {username}: {e}")
        return None


def analyze_twitter_user(username, profile=None):
    """analyze a twitter user for alignment"""
    if not profile:
        profile = get_user_profile(username)

    if not profile:
        return None

    # collect text
    text_parts = [profile.get('bio', '')]
    text_parts.extend(profile.get('recent_tweets', []))

    full_text = ' '.join(text_parts)
    text_score, positive_signals, negative_signals = analyze_text(full_text)

    # twitter is noisy, lower base confidence
    confidence = 0.25
    if len(positive_signals) >= 3:
        confidence += 0.2
    if profile.get('website'):
        confidence += 0.1
    if len(profile.get('recent_tweets', [])) >= 5:
        confidence += 0.1
    confidence = min(confidence, 0.7)  # cap lower for twitter

    reasons = []
    if positive_signals:
        reasons.append(f"signals: {', '.join(positive_signals[:5])}")
    if negative_signals:
        reasons.append(f"WARNING: {', '.join(negative_signals)}")

    return {
        'platform': 'twitter',
        'username': username,
        'url': f"https://twitter.com/{username}",
        'name': profile.get('name', username),
        'bio': profile.get('bio'),
        'location': profile.get('location'),
        'score': text_score,
        'confidence': confidence,
        'signals': positive_signals,
        'negative_signals': negative_signals,
        'reasons': reasons,
        'contact': {
            'twitter': username,
            'website': profile.get('website'),
        },
        'scraped_at': datetime.now().isoformat(),
    }


def scrape_twitter(db, limit_per_hashtag=50):
    """full twitter scrape via nitter"""
    print("scoutd/twitter: starting scrape via nitter...")

    all_users = {}

    for hashtag in ALIGNED_HASHTAGS:
        print(f"  #{hashtag}...")
        tweets = search_hashtag(hashtag)

        for tweet in tweets[:limit_per_hashtag]:
            username = tweet.get('username')
            if username and username not in all_users:
                all_users[username] = {
                    'username': username,
                    'name': tweet.get('name'),
                    'hashtags': [hashtag],
                }
            elif username:
                all_users[username]['hashtags'].append(hashtag)

        print(f"    found {len(tweets)} tweets")

    # prioritize users in multiple hashtags
    multi_hashtag = {u: d for u, d in all_users.items() if len(d.get('hashtags', [])) >= 2}
    print(f"  {len(multi_hashtag)} users in 2+ aligned hashtags")

    # analyze
    results = []
    for username, data in list(multi_hashtag.items())[:100]:  # limit to prevent rate limits
        try:
            result = analyze_twitter_user(username)
            if result and result['score'] > 0:
                results.append(result)
                db.save_human(result)

                if result['score'] >= 30:
                    print(f"  ★ @{username}: {result['score']} pts")
        except Exception as e:
            print(f"  error on {username}: {e}")

    print(f"scoutd/twitter: found {len(results)} aligned humans")
    return results
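The instance rotation above is plain round-robin over a module-level index: `get_nitter_instance` reads `_current_instance_idx` modulo the list length, and `rotate_instance` bumps the index on a 429/503 or a connection error. Stripped of the scraping, it reduces to this (the instance names are the ones from `NITTER_INSTANCES`):

```python
# round-robin instance selection, as in scoutd/twitter.py
INSTANCES = ['nitter.privacydev.net', 'nitter.poast.org',
             'nitter.woodland.cafe', 'nitter.esmailelbob.xyz']
_idx = 0

def get_instance():
    return INSTANCES[_idx % len(INSTANCES)]

def rotate():
    global _idx
    _idx += 1

seen = []
for _ in range(5):
    seen.append(get_instance())
    rotate()  # e.g. after a 429/503 response

print(seen[0], seen[4])  # index 4 wraps back around to the first instance
```

Because the index only ever increments, the modulo makes the list circular; after a full failing pass, `_scrape_page`'s `retries` bound is what stops the loop rather than the rotation itself.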
@@ -1,143 +0,0 @@
#!/usr/bin/env python3
"""
setup priority user - add yourself to get matches

usage:
    python setup_user.py            # interactive setup
    python setup_user.py --show     # show your profile
    python setup_user.py --matches  # show your matches
"""

import argparse
import json
from db import Database
from db.users import (init_users_table, add_priority_user, get_priority_users,
                      get_priority_user_matches)


def interactive_setup(db):
    """interactive priority user setup"""
    print("=" * 60)
    print("connectd priority user setup")
    print("=" * 60)
    print("\nlink your profiles so connectd can find matches for YOU\n")

    name = input("name: ").strip()
    email = input("email (for notifications): ").strip()
    github = input("github username (optional): ").strip() or None
    reddit = input("reddit username (optional): ").strip() or None
    mastodon = input("mastodon handle e.g. user@instance (optional): ").strip() or None
    lobsters = input("lobste.rs username (optional): ").strip() or None
    matrix = input("matrix id e.g. @user:matrix.org (optional): ").strip() or None
    location = input("location (e.g. seattle, remote): ").strip() or None

    print("\nwhat are you interested in? (comma separated)")
    print("examples: self-hosting, cooperatives, solarpunk, home automation")
    interests_raw = input("interests: ").strip()
    interests = [i.strip() for i in interests_raw.split(',')] if interests_raw else []

    print("\nwhat kind of people are you looking to connect with?")
    looking_for = input("looking for: ").strip() or None

    user_data = {
        'name': name,
        'email': email,
        'github': github,
        'reddit': reddit,
        'mastodon': mastodon,
        'lobsters': lobsters,
        'matrix': matrix,
        'location': location,
        'interests': interests,
        'looking_for': looking_for,
    }

    user_id = add_priority_user(db.conn, user_data)
    print(f"\n✓ added as priority user #{user_id}")
    print("connectd will now find matches for you")


def show_profile(db):
    """show current priority user profile"""
    users = get_priority_users(db.conn)

    if not users:
        print("no priority users configured")
        print("run: python setup_user.py")
        return

    for user in users:
        print("=" * 60)
        print(f"priority user #{user['id']}: {user['name']}")
        print("=" * 60)
        print(f"email: {user['email']}")
        if user['github']:
            print(f"github: {user['github']}")
        if user['reddit']:
            print(f"reddit: {user['reddit']}")
        if user['mastodon']:
            print(f"mastodon: {user['mastodon']}")
        if user['lobsters']:
            print(f"lobsters: {user['lobsters']}")
        if user['matrix']:
            print(f"matrix: {user['matrix']}")
        if user['location']:
            print(f"location: {user['location']}")
        if user['interests']:
            interests = json.loads(user['interests']) if isinstance(user['interests'], str) else user['interests']
            print(f"interests: {', '.join(interests)}")
        if user['looking_for']:
            print(f"looking for: {user['looking_for']}")


def show_matches(db):
    """show matches for priority user"""
    users = get_priority_users(db.conn)

    if not users:
        print("no priority users configured")
        return

    for user in users:
        print(f"\n=== matches for {user['name']} ===\n")

        matches = get_priority_user_matches(db.conn, user['id'], limit=20)

        if not matches:
            print("no matches yet - run the daemon to discover people")
            continue

        for i, match in enumerate(matches, 1):
            print(f"{i}. {match['username']} ({match['platform']})")
            print(f"   score: {match['overlap_score']:.0f}")
            print(f"   url: {match['url']}")

            reasons = match.get('overlap_reasons', '[]')
            if isinstance(reasons, str):
                reasons = json.loads(reasons)
            if reasons:
                print(f"   why: {reasons[0] if reasons else ''}")
            print()


def main():
    parser = argparse.ArgumentParser(description='setup priority user')
    parser.add_argument('--show', action='store_true', help='show your profile')
    parser.add_argument('--matches', action='store_true', help='show your matches')
    args = parser.parse_args()

    db = Database()
    init_users_table(db.conn)

    if args.show:
        show_profile(db)
    elif args.matches:
        show_matches(db)
    else:
        interactive_setup(db)

    db.close()


if __name__ == '__main__':
    main()
@@ -334,18 +334,24 @@ def determine_best_contact(human):
    """
    determine best contact method based on WHERE THEY'RE MOST ACTIVE

    uses activity-based selection from groq_draft module
    returns: (method, info, fallbacks)
    uses activity-based selection - ranks by user's actual usage
    """
    from introd.groq_draft import determine_contact_method as activity_based_contact

    method, info = activity_based_contact(human)
    method, info, fallbacks = activity_based_contact(human)

    # convert github_issue info to dict format for delivery
    if method == 'github_issue' and isinstance(info, str) and '/' in info:
        parts = info.split('/', 1)
        return method, {'owner': parts[0], 'repo': parts[1]}
    def format_info(m, i):
        if m == 'github_issue' and isinstance(i, str) and '/' in i:
            parts = i.split('/', 1)
            return {'owner': parts[0], 'repo': parts[1]}
        return i

    return method, info
    info = format_info(method, info)
    fallbacks = [(m, format_info(m, i)) for m, i in fallbacks]

    return method, info, fallbacks


def deliver_intro(match_data, intro_draft, dry_run=False):

@@ -362,8 +368,8 @@ def deliver_intro(match_data, intro_draft, dry_run=False):
    if already_contacted(recipient_id):
        return False, "already contacted", None

    # determine contact method
    method, contact_info = determine_best_contact(recipient)
    # determine contact method with fallbacks
    method, contact_info, fallbacks = determine_best_contact(recipient)

    log = load_delivery_log()
    result = {

@@ -423,9 +429,60 @@ def deliver_intro(match_data, intro_draft, dry_run=False):
        success = True
        error = "added to manual queue"

    # if failed and we have fallbacks, try them
    if not success and fallbacks:
        for fallback_method, fallback_info in fallbacks:
            result['fallback_attempts'] = result.get('fallback_attempts', [])
            result['fallback_attempts'].append({
                'method': fallback_method,
                'contact_info': fallback_info
            })

            fb_success = False
            fb_error = None

            if fallback_method == 'email':
                subject = f"someone you might want to know - connectd"
                fb_success, fb_error = send_email(fallback_info, subject, intro_draft, dry_run)
            elif fallback_method == 'mastodon':
                fb_success, fb_error = send_mastodon_dm(fallback_info, intro_draft, dry_run)
            elif fallback_method == 'bluesky':
                fb_success, fb_error = send_bluesky_dm(fallback_info, intro_draft, dry_run)
            elif fallback_method == 'matrix':
                fb_success, fb_error = send_matrix_dm(fallback_info, intro_draft, dry_run)
            elif fallback_method == 'lemmy':
                from scoutd.lemmy import send_lemmy_dm
                fb_success, fb_error = send_lemmy_dm(fallback_info, intro_draft, dry_run)
            elif fallback_method == 'discord':
                from scoutd.discord import send_discord_dm
                fb_success, fb_error = send_discord_dm(fallback_info, intro_draft, dry_run)
            elif fallback_method == 'github_issue':
                owner = fallback_info.get('owner')
                repo = fallback_info.get('repo')
                title = "community introduction from connectd"
                github_body = f"""hey {recipient.get('name') or recipient.get('username')},

{intro_draft}

---
*automated introduction from connectd*
"""
                fb_success, fb_error = create_github_issue(owner, repo, title, github_body, dry_run)

            if fb_success:
                success = True
                method = fallback_method
                contact_info = fallback_info
                error = None
                result['fallback_succeeded'] = fallback_method
                break
            else:
                result['fallback_attempts'][-1]['error'] = fb_error

    # log result
    result['success'] = success
    result['error'] = error
    result['final_method'] = method

    if success:
        log['sent'].append(result)
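The fallback flow in the hunk above reduces to: attempt the primary channel, then walk the ranked fallback list in order, recording each attempt, and stop at the first sender that succeeds. A condensed sketch with hypothetical stub senders standing in for `send_email`, `send_mastodon_dm`, and the rest (all of which return a `(success, error)` pair):

```python
def failing_sender(info, draft):
    return False, "unreachable"

def working_sender(info, draft):
    return True, None

def deliver_with_fallbacks(primary, fallbacks, draft):
    """try the primary (method, info, sender) tuple, then each fallback in order"""
    attempts = []
    for method, info, sender in [primary] + fallbacks:
        ok, err = sender(info, draft)
        attempts.append({'method': method, 'error': err})
        if ok:
            return method, attempts  # first success wins
    return None, attempts

method, attempts = deliver_with_fallbacks(
    ('email', 'a@example.org', failing_sender),
    [('mastodon', '@a@example.social', failing_sender),
     ('matrix', '@a:example.org', working_sender)],
    "hi - intro draft here",
)
print(method)          # matrix
print(len(attempts))   # 3
```

The attempt log mirrors `result['fallback_attempts']` in the diff: every failed channel keeps its error string, so a delivery that eventually succeeded still shows what was tried first.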
@ -1,10 +1,14 @@
|
|||
"""
|
||||
introd/draft.py - AI writes intro messages referencing both parties' work
|
||||
now with interest system links
|
||||
"""
|
||||
|
||||
import json
|
||||
|
||||
# intro template - transparent about being AI, neutral third party
|
||||
# base URL for connectd profiles
|
||||
CONNECTD_URL = "https://connectd.sudoxreboot.com"
|
||||
|
||||
# intro template - now with interest links
|
||||
INTRO_TEMPLATE = """hi {recipient_name},
|
||||
|
||||
i'm an AI that connects isolated builders working on similar things.
|
||||
|
|
@ -17,7 +21,8 @@ overlap: {overlap_summary}
|
|||
|
||||
thought you might benefit from knowing each other.
|
||||
|
||||
their work: {other_url}
|
||||
their profile: {profile_url}
|
||||
{interested_line}
|
||||
|
||||
no pitch. just connection. ignore if not useful.
|
||||
|
||||
|
|
@ -32,7 +37,7 @@ you: {recipient_summary}
|
|||
|
||||
overlap: {overlap_summary}
|
||||
|
||||
their work: {other_url}
|
||||
their profile: {profile_url}
|
||||
|
||||
no pitch, just connection.
|
||||
"""
|
||||
|
|
@ -51,12 +56,18 @@ def summarize_human(human_data):
|
|||
# signals/interests
|
||||
signals = human_data.get('signals', [])
|
||||
if isinstance(signals, str):
|
||||
try:
|
||||
signals = json.loads(signals)
|
||||
except:
|
||||
signals = []
|
||||
|
||||
# extra data
|
||||
extra = human_data.get('extra', {})
|
||||
if isinstance(extra, str):
|
||||
try:
|
||||
extra = json.loads(extra)
|
||||
except:
|
||||
extra = {}
|
||||
|
||||
# build summary based on available data
|
||||
topics = extra.get('topics', [])
|
||||
|
|
@ -103,7 +114,10 @@ def summarize_overlap(overlap_data):
|
|||
"""generate overlap summary"""
|
||||
reasons = overlap_data.get('overlap_reasons', [])
|
||||
if isinstance(reasons, str):
|
||||
try:
|
||||
reasons = json.loads(reasons)
|
||||
except:
|
||||
reasons = []
|
||||
|
||||
if reasons:
|
||||
return ' | '.join(reasons[:3])
|
||||
|
|
@ -116,12 +130,14 @@ def summarize_overlap(overlap_data):
|
|||
return "aligned values and interests"
|
||||
|
||||
|
||||
def draft_intro(match_data, recipient='a'):
|
||||
def draft_intro(match_data, recipient='a', recipient_token=None, interested_count=0):
|
||||
"""
|
||||
draft an intro message for a match
|
||||
|
||||
match_data: dict with human_a, human_b, overlap info
|
||||
recipient: 'a' or 'b' - who receives this intro
|
||||
recipient_token: token for the recipient (to track who clicked)
|
||||
interested_count: how many people are already interested in the recipient
|
||||
|
||||
returns: dict with draft text, channel, metadata
|
||||
"""
|
||||
|
|
@@ -135,19 +151,37 @@ def draft_intro(match_data, recipient='a'):
    # get names
    recipient_name = recipient_human.get('name') or recipient_human.get('username', 'friend')
    other_name = other_human.get('name') or other_human.get('username', 'someone')
+    other_username = other_human.get('username', '')

    # generate summaries
    recipient_summary = summarize_human(recipient_human)
    other_summary = summarize_human(other_human)
    overlap_summary = summarize_overlap(match_data)

-    # other's url
-    other_url = other_human.get('url', '')
+    # build profile URL with token if available
+    if other_username:
+        profile_url = f"{CONNECTD_URL}/{other_username}"
+        if recipient_token:
+            profile_url += f"?t={recipient_token}"
+    else:
+        profile_url = other_human.get('url', '')
+
+    # interested line - tells them about their inbox
+    interested_line = ''
+    if recipient_token:
+        interested_url = f"{CONNECTD_URL}/interested/{recipient_token}"
+        if interested_count > 0:
+            interested_line = f"\n{interested_count} people already want to meet you: {interested_url}"
+        else:
+            interested_line = f"\nbe the first to connect: {interested_url}"

    # determine best channel
    contact = recipient_human.get('contact', {})
    if isinstance(contact, str):
        try:
            contact = json.loads(contact)
        except:
            contact = {}

    channel = None
    channel_address = None

@@ -156,15 +190,12 @@ def draft_intro(match_data, recipient='a'):
    if contact.get('email'):
        channel = 'email'
        channel_address = contact['email']
    # github issue/discussion
    elif recipient_human.get('platform') == 'github':
        channel = 'github'
        channel_address = recipient_human.get('url')
    # mastodon DM
    elif recipient_human.get('platform') == 'mastodon':
        channel = 'mastodon'
        channel_address = recipient_human.get('username')
    # reddit message
    elif recipient_human.get('platform') == 'reddit':
        channel = 'reddit'
        channel_address = recipient_human.get('username')

@@ -180,12 +211,13 @@ def draft_intro(match_data, recipient='a'):

    # render draft
    draft = template.format(
-        recipient_name=recipient_name.split()[0] if recipient_name else 'friend',  # first name only
+        recipient_name=recipient_name.split()[0] if recipient_name else 'friend',
        recipient_summary=recipient_summary,
        other_name=other_name.split()[0] if other_name else 'someone',
        other_summary=other_summary,
        overlap_summary=overlap_summary,
-        other_url=other_url,
+        profile_url=profile_url,
+        interested_line=interested_line,
    )

    return {

@@ -196,15 +228,16 @@ def draft_intro(match_data, recipient='a'):
        'draft': draft,
        'overlap_score': match_data.get('overlap_score', 0),
        'match_id': match_data.get('id'),
+        'recipient_token': recipient_token,
    }


-def draft_intros_for_match(match_data):
+def draft_intros_for_match(match_data, token_a=None, token_b=None, interested_a=0, interested_b=0):
    """
    draft intros for both parties in a match
    returns list of two intro dicts
    """
-    intro_a = draft_intro(match_data, recipient='a')
-    intro_b = draft_intro(match_data, recipient='b')
+    intro_a = draft_intro(match_data, recipient='a', recipient_token=token_a, interested_count=interested_a)
+    intro_b = draft_intro(match_data, recipient='b', recipient_token=token_b, interested_count=interested_b)

    return [intro_a, intro_b]


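the tokenized profile-link change above can be exercised in isolation. a minimal sketch of that logic (the `CONNECTD_URL` value and the `build_profile_url` name here are illustrative, not the repo's actual code):

```python
# illustrative sketch of the tokenized profile-URL logic added to draft_intro;
# CONNECTD_URL and build_profile_url are assumed names, not from the repo
CONNECTD_URL = "https://connectd.example"

def build_profile_url(other_username, recipient_token=None, fallback_url=''):
    # prefer a connectd profile page; append ?t=<token> so clicks can be attributed
    if other_username:
        url = f"{CONNECTD_URL}/{other_username}"
        if recipient_token:
            url += f"?t={recipient_token}"
        return url
    # no username known: fall back to whatever external url we have
    return fallback_url

print(build_profile_url("alice", recipient_token="tok123"))
print(build_profile_url("", fallback_url="https://example.net/alice"))
```

the `?t=` parameter is what lets the recipient be identified when they open the link, which is also why the token rides along into the `/interested/` inbox URL.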
@@ -1,437 +1,435 @@
"""
-introd/groq_draft.py - groq llama 4 maverick for smart intro drafting
-
-uses groq api to generate personalized, natural intro messages
-that don't sound like ai-generated slop
+connectd - groq message drafting
+reads soul from file, uses as guideline for llm to personalize
"""

import os
import json
-import requests
-from datetime import datetime
+from groq import Groq

-GROQ_API_KEY = os.environ.get('GROQ_API_KEY', '')
-GROQ_API_URL = 'https://api.groq.com/openai/v1/chat/completions'
-MODEL = os.environ.get('GROQ_MODEL', 'llama-3.1-70b-versatile')
+GROQ_API_KEY = os.getenv("GROQ_API_KEY")
+GROQ_MODEL = os.getenv("GROQ_MODEL", "llama-3.3-70b-versatile")

+client = Groq(api_key=GROQ_API_KEY) if GROQ_API_KEY else None

-def determine_contact_method(human):
-    """
-    determine best contact method based on WHERE THEY'RE MOST ACTIVE
-
-    don't use fixed hierarchy - analyze activity per platform:
-    - count posts/commits/activity
-    - weight by recency (last 30 days matters more)
-    - contact them where they already are
-    - fall back to email only if no social activity
-    """
-    from datetime import datetime, timedelta
-
-    extra = human.get('extra', {})
-    if isinstance(extra, str):
-        extra = json.loads(extra) if extra else {}
-
-    # handle nested extra.extra from old save format
-    if 'extra' in extra and isinstance(extra['extra'], dict):
-        extra = {**extra, **extra['extra']}
-
-    contact = human.get('contact', {})
-    if isinstance(contact, str):
-        contact = json.loads(contact) if contact else {}
-
-    # collect activity scores per platform
-    activity_scores = {}
-    now = datetime.now()
-    thirty_days_ago = now - timedelta(days=30)
-    ninety_days_ago = now - timedelta(days=90)
-
-    # github activity
-    github_username = human.get('username') if human.get('platform') == 'github' else extra.get('github')
-    if github_username:
-        github_score = 0
-        top_repos = extra.get('top_repos', [])
-
-        for repo in top_repos:
-            # recent commits weight more
-            pushed_at = repo.get('pushed_at', '')
-            if pushed_at:
+# load soul from file (guideline, not script)
+SOUL_PATH = os.getenv("SOUL_PATH", "/app/soul.txt")
+def load_soul():
    try:
-                    push_date = datetime.fromisoformat(pushed_at.replace('Z', '+00:00')).replace(tzinfo=None)
-                    if push_date > thirty_days_ago:
-                        github_score += 10  # very recent
-                    elif push_date > ninety_days_ago:
-                        github_score += 5  # somewhat recent
-                    else:
-                        github_score += 1  # old but exists
+        with open(SOUL_PATH, 'r') as f:
+            return f.read().strip()
    except:
-                    github_score += 1
+        return None

-            # stars indicate engagement
-            github_score += min(repo.get('stars', 0) // 10, 5)
+SIGNATURE_HTML = """
+<div style="margin-top: 24px; padding-top: 16px; border-top: 1px solid #333;">
+  <div style="margin-bottom: 12px;">
+    <a href="https://github.com/sudoxnym/connectd" style="color: #8b5cf6; text-decoration: none; font-size: 14px;">github.com/sudoxnym/connectd</a>
+    <span style="color: #666; font-size: 12px; margin-left: 8px;">(main repo)</span>
+  </div>
+  <div style="display: flex; gap: 16px; align-items: center;">
<a href="https://github.com/connectd-daemon" title="GitHub" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M12 .297c-6.63 0-12 5.373-12 12 0 5.303 3.438 9.8 8.205 11.385.6.113.82-.258.82-.577 0-.285-.01-1.04-.015-2.04-3.338.724-4.042-1.61-4.042-1.61C4.422 18.07 3.633 17.7 3.633 17.7c-1.087-.744.084-.729.084-.729 1.205.084 1.838 1.236 1.838 1.236 1.07 1.835 2.809 1.305 3.495.998.108-.776.417-1.305.76-1.605-2.665-.3-5.466-1.332-5.466-5.93 0-1.31.465-2.38 1.235-3.22-.135-.303-.54-1.523.105-3.176 0 0 1.005-.322 3.3 1.23.96-.267 1.98-.399 3-.405 1.02.006 2.04.138 3 .405 2.28-1.552 3.285-1.23 3.285-1.23.645 1.653.24 2.873.12 3.176.765.84 1.23 1.91 1.23 3.22 0 4.61-2.805 5.625-5.475 5.92.42.36.81 1.096.81 2.22 0 1.606-.015 2.896-.015 3.286 0 .315.21.69.825.57C20.565 22.092 24 17.592 24 12.297c0-6.627-5.373-12-12-12"/></svg>
</a>
<a href="https://mastodon.sudoxreboot.com/@connectd" title="Mastodon" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M23.268 5.313c-.35-2.578-2.617-4.61-5.304-5.004C17.51.242 15.792 0 11.813 0h-.03c-3.98 0-4.835.242-5.288.309C3.882.692 1.496 2.518.917 5.127.64 6.412.61 7.837.661 9.143c.074 1.874.088 3.745.26 5.611.118 1.24.325 2.47.62 3.68.55 2.237 2.777 4.098 4.96 4.857 2.336.792 4.849.923 7.256.38.265-.061.527-.132.786-.213.585-.184 1.27-.39 1.774-.753a.057.057 0 0 0 .023-.043v-1.809a.052.052 0 0 0-.02-.041.053.053 0 0 0-.046-.01 20.282 20.282 0 0 1-4.709.545c-2.73 0-3.463-1.284-3.674-1.818a5.593 5.593 0 0 1-.319-1.433.053.053 0 0 1 .066-.054c1.517.363 3.072.546 4.632.546.376 0 .75 0 1.125-.01 1.57-.044 3.224-.124 4.768-.422.038-.008.077-.015.11-.024 2.435-.464 4.753-1.92 4.989-5.604.008-.145.03-1.52.03-1.67.002-.512.167-3.63-.024-5.545zm-3.748 9.195h-2.561V8.29c0-1.309-.55-1.976-1.67-1.976-1.23 0-1.846.79-1.846 2.35v3.403h-2.546V8.663c0-1.56-.617-2.35-1.848-2.35-1.112 0-1.668.668-1.67 1.977v6.218H4.822V8.102c0-1.31.337-2.35 1.011-3.12.696-.77 1.608-1.164 2.74-1.164 1.311 0 2.302.5 2.962 1.498l.638 1.06.638-1.06c.66-.999 1.65-1.498 2.96-1.498 1.13 0 2.043.395 2.74 1.164.675.77 1.012 1.81 1.012 3.12z"/></svg>
</a>
<a href="https://bsky.app/profile/connectd.bsky.social" title="Bluesky" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M5.202 2.857C7.954 4.922 10.913 9.11 12 11.358c1.087-2.247 4.046-6.436 6.798-8.501C20.783 1.366 24 .213 24 3.883c0 .732-.42 6.156-.667 7.037-.856 3.061-3.978 3.842-6.755 3.37 4.854.826 6.089 3.562 3.422 6.299-5.065 5.196-7.28-1.304-7.847-2.97-.104-.305-.152-.448-.153-.327 0-.121-.05.022-.153.327-.568 1.666-2.782 8.166-7.847 2.97-2.667-2.737-1.432-5.473 3.422-6.3-2.777.473-5.899-.308-6.755-3.369C.42 10.04 0 4.615 0 3.883c0-3.67 3.217-2.517 5.202-1.026"/></svg>
</a>
<a href="https://lemmy.sudoxreboot.com/c/connectd" title="Lemmy" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M2.9595 4.2228a3.9132 3.9132 0 0 0-.332.019c-.8781.1012-1.67.5699-2.155 1.3862-.475.8-.5922 1.6809-.35 2.4971.2421.8162.8297 1.5575 1.6982 2.1449.0053.0035.0106.0076.0163.0114.746.4498 1.492.7431 2.2877.8994-.02.3318-.0272.6689-.006 1.0181.0634 1.0432.4368 2.0006.996 2.8492l-2.0061.8189a.4163.4163 0 0 0-.2276.2239.416.416 0 0 0 .0879.455.415.415 0 0 0 .2941.1231.4156.4156 0 0 0 .1595-.0312l2.2093-.9035c.408.4859.8695.9315 1.3723 1.318.0196.0151.0407.0264.0603.0423l-1.2918 1.7103a.416.416 0 0 0 .664.501l1.314-1.7385c.7185.4548 1.4782.7927 2.2294 1.0242.3833.7209 1.1379 1.1871 2.0202 1.1871.8907 0 1.6442-.501 2.0242-1.2072.744-.2347 1.4959-.5729 2.2073-1.0262l1.332 1.7606a.4157.4157 0 0 0 .7439-.1936.4165.4165 0 0 0-.0799-.3074l-1.3099-1.7345c.0083-.0075.0178-.0113.0261-.0188.4968-.3803.9549-.8175 1.3622-1.2939l2.155.8794a.4156.4156 0 0 0 .5412-.2276.4151.4151 0 0 0-.2273-.5432l-1.9438-.7928c.577-.8538.9697-1.8183 1.0504-2.8693.0268-.3507.0242-.6914.0079-1.0262.7905-.1572 1.5321-.4502 2.2737-.8974.0053-.0033.011-.0076.0163-.0113.8684-.5874 1.456-1.3287 1.6982-2.145.2421-.8161.125-1.697-.3501-2.497-.4849-.8163-1.2768-1.2852-2.155-1.3863a3.2175 3.2175 0 0 0-.332-.0189c-.7852-.0151-1.6231.229-2.4286.6942-.5926.342-1.1252.867-1.5433 1.4387-1.1699-.6703-2.6923-1.0476-4.5635-1.0785a15.5768 15.5768 0 0 0-.5111 0c-2.085.034-3.7537.43-5.0142 1.1449-.0033-.0038-.0045-.0114-.008-.0152-.4233-.5916-.973-1.1365-1.5835-1.489-.8055-.465-1.6434-.7083-2.4286-.6941Zm.2858.7365c.5568.042 1.1696.2358 1.7787.5875.485.28.9757.7554 1.346 1.2696a5.6875 5.6875 0 0 0-.4969.4085c-.9201.8516-1.4615 1.9597-1.668 3.2335-.6809-.1402-1.3183-.3945-1.984-.7948-.7553-.5128-1.2159-1.1225-1.4004-1.7445-.1851-.624-.1074-1.2712.2776-1.9196.3743-.63.9275-.9534 1.6118-1.0322a2.796 2.796 0 0 1 .5352-.0076Zm17.5094 0a2.797 2.797 0 0 1 .5353.0075c.6842.0786 1.2374.4021 1.6117 1.0322.385.6484.4627 1.2957.2776 1.9196-.1845.622-.645 
1.2317-1.4004 1.7445-.6578.3955-1.2881.6472-1.9598.7888-.1942-1.2968-.7375-2.4338-1.666-3.302a5.5639 5.5639 0 0 0-.4709-.3923c.3645-.49.8287-.9428 1.2938-1.2113.6091-.3515 1.2219-.5454 1.7787-.5875ZM12.006 6.0036a14.832 14.832 0 0 1 .487 0c2.3901.0393 4.0848.67 5.1631 1.678 1.1501 1.0754 1.6423 2.6006 1.499 4.467-.1311 1.7079-1.2203 3.2281-2.652 4.324-.694.5313-1.4626.9354-2.2254 1.2294.0031-.0453.014-.0888.014-.1349.0029-1.1964-.9313-2.2133-2.2918-2.2133-1.3606 0-2.3222 1.0154-2.2918 2.2213.0013.0507.014.0972.0181.1471-.781-.2933-1.5696-.7013-2.2777-1.2456-1.4239-1.0945-2.4997-2.6129-2.6037-4.322-.1129-1.8567.3778-3.3382 1.5212-4.3965C7.5094 6.7 9.352 6.047 12.006 6.0036Zm-3.6419 6.8291c-.6053 0-1.0966.4903-1.0966 1.0966 0 .6063.4913 1.0986 1.0966 1.0986s1.0966-.4923 1.0966-1.0986c0-.6063-.4913-1.0966-1.0966-1.0966zm7.2819.0113c-.5998 0-1.0866.4859-1.0866 1.0866s.4868 1.0885 1.0866 1.0885c.5997 0 1.0865-.4878 1.0865-1.0885s-.4868-1.0866-1.0865-1.0866zM12 16.0835c1.0237 0 1.5654.638 1.5634 1.4829-.0018.7849-.6723 1.485-1.5634 1.485-.9167 0-1.54-.5629-1.5634-1.493-.0212-.8347.5397-1.4749 1.5634-1.4749Z"/></svg>
</a>
<a href="https://discord.gg/connectd" title="Discord" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M20.317 4.3698a19.7913 19.7913 0 00-4.8851-1.5152.0741.0741 0 00-.0785.0371c-.211.3753-.4447.8648-.6083 1.2495-1.8447-.2762-3.68-.2762-5.4868 0-.1636-.3933-.4058-.8742-.6177-1.2495a.077.077 0 00-.0785-.037 19.7363 19.7363 0 00-4.8852 1.515.0699.0699 0 00-.0321.0277C.5334 9.0458-.319 13.5799.0992 18.0578a.0824.0824 0 00.0312.0561c2.0528 1.5076 4.0413 2.4228 5.9929 3.0294a.0777.0777 0 00.0842-.0276c.4616-.6304.8731-1.2952 1.226-1.9942a.076.076 0 00-.0416-.1057c-.6528-.2476-1.2743-.5495-1.8722-.8923a.077.077 0 01-.0076-.1277c.1258-.0943.2517-.1923.3718-.2914a.0743.0743 0 01.0776-.0105c3.9278 1.7933 8.18 1.7933 12.0614 0a.0739.0739 0 01.0785.0095c.1202.099.246.1981.3728.2924a.077.077 0 01-.0066.1276 12.2986 12.2986 0 01-1.873.8914.0766.0766 0 00-.0407.1067c.3604.698.7719 1.3628 1.225 1.9932a.076.076 0 00.0842.0286c1.961-.6067 3.9495-1.5219 6.0023-3.0294a.077.077 0 00.0313-.0552c.5004-5.177-.8382-9.6739-3.5485-13.6604a.061.061 0 00-.0312-.0286zM8.02 15.3312c-1.1825 0-2.1569-1.0857-2.1569-2.419 0-1.3332.9555-2.4189 2.157-2.4189 1.2108 0 2.1757 1.0952 2.1568 2.419 0 1.3332-.9555 2.4189-2.1569 2.4189zm7.9748 0c-1.1825 0-2.1569-1.0857-2.1569-2.419 0-1.3332.9554-2.4189 2.1569-2.4189 1.2108 0 2.1757 1.0952 2.1568 2.419 0 1.3332-.946 2.4189-2.1568 2.4189Z"/></svg>
</a>
<a href="https://matrix.to/#/@connectd:sudoxreboot.com" title="Matrix" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M.632.55v22.9H2.28V24H0V0h2.28v.55zm7.043 7.26v1.157h.033c.309-.443.683-.784 1.117-1.024.433-.245.936-.365 1.5-.365.54 0 1.033.107 1.481.314.448.208.785.582 1.02 1.108.254-.374.6-.706 1.034-.992.434-.287.95-.43 1.546-.43.453 0 .872.056 1.26.167.388.11.716.286.993.53.276.245.489.559.646.951.152.392.23.863.23 1.417v5.728h-2.349V11.52c0-.286-.01-.559-.032-.812a1.755 1.755 0 0 0-.18-.66 1.106 1.106 0 0 0-.438-.448c-.194-.11-.457-.166-.785-.166-.332 0-.6.064-.803.189a1.38 1.38 0 0 0-.48.499 1.946 1.946 0 0 0-.231.696 5.56 5.56 0 0 0-.06.785v4.768h-2.35v-4.8c0-.254-.004-.503-.018-.752a2.074 2.074 0 0 0-.143-.688 1.052 1.052 0 0 0-.415-.503c-.194-.125-.476-.19-.854-.19-.111 0-.259.024-.439.074-.18.051-.36.143-.53.282-.171.138-.319.337-.439.595-.12.259-.18.6-.18 1.02v4.966H5.46V7.81zm15.693 15.64V.55H21.72V0H24v24h-2.28v-.55z"/></svg>
</a>
<a href="https://reddit.com/r/connectd" title="Reddit" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M12 0C5.373 0 0 5.373 0 12c0 3.314 1.343 6.314 3.515 8.485l-2.286 2.286C.775 23.225 1.097 24 1.738 24H12c6.627 0 12-5.373 12-12S18.627 0 12 0Zm4.388 3.199c1.104 0 1.999.895 1.999 1.999 0 1.105-.895 2-1.999 2-.946 0-1.739-.657-1.947-1.539v.002c-1.147.162-2.032 1.15-2.032 2.341v.007c1.776.067 3.4.567 4.686 1.363.473-.363 1.064-.58 1.707-.58 1.547 0 2.802 1.254 2.802 2.802 0 1.117-.655 2.081-1.601 2.531-.088 3.256-3.637 5.876-7.997 5.876-4.361 0-7.905-2.617-7.998-5.87-.954-.447-1.614-1.415-1.614-2.538 0-1.548 1.255-2.802 2.803-2.802.645 0 1.239.218 1.712.585 1.275-.79 2.881-1.291 4.64-1.365v-.01c0-1.663 1.263-3.034 2.88-3.207.188-.911.993-1.595 1.959-1.595Zm-8.085 8.376c-.784 0-1.459.78-1.506 1.797-.047 1.016.64 1.429 1.426 1.429.786 0 1.371-.369 1.418-1.385.047-1.017-.553-1.841-1.338-1.841Zm7.406 0c-.786 0-1.385.824-1.338 1.841.047 1.017.634 1.385 1.418 1.385.785 0 1.473-.413 1.426-1.429-.046-1.017-.721-1.797-1.506-1.797Zm-3.703 4.013c-.974 0-1.907.048-2.77.135-.147.015-.241.168-.183.305.483 1.154 1.622 1.964 2.953 1.964 1.33 0 2.47-.81 2.953-1.964.057-.137-.037-.29-.184-.305-.863-.087-1.795-.135-2.769-.135Z"/></svg>
</a>
<a href="mailto:connectd@sudoxreboot.com" title="Email" style="color: #888; text-decoration: none;">
<svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M1.5 8.67v8.58a3 3 0 003 3h15a3 3 0 003-3V8.67l-8.928 5.493a3 3 0 01-3.144 0L1.5 8.67z"/><path d="M22.5 6.908V6.75a3 3 0 00-3-3h-15a3 3 0 00-3 3v.158l9.714 5.978a1.5 1.5 0 001.572 0L22.5 6.908z"/></svg>
</a>
</div>
</div>
"""

-        # commit activity from deep scrape
-        commit_count = extra.get('commit_count', 0)
-        github_score += min(commit_count // 10, 20)
+SIGNATURE_PLAINTEXT = """
+---
+github.com/sudoxnym/connectd (main repo)

-        if github_score > 0:
-            activity_scores['github_issue'] = {
-                'score': github_score,
-                'info': f"{github_username}/{top_repos[0]['name']}" if top_repos else github_username
-            }
-
-    # mastodon activity
-    mastodon_handle = extra.get('mastodon') or contact.get('mastodon')
-    if mastodon_handle:
-        mastodon_score = 0
-        statuses_count = extra.get('mastodon_statuses', 0) or human.get('statuses_count', 0)
-
-        # high post count = active user
-        if statuses_count > 1000:
-            mastodon_score += 30
-        elif statuses_count > 500:
-            mastodon_score += 20
-        elif statuses_count > 100:
-            mastodon_score += 10
-        elif statuses_count > 0:
-            mastodon_score += 5
-
-        # platform bonus for fediverse (values-aligned)
-        mastodon_score += 10
-
-        # bonus if handle was discovered via rel="me" or similar verification
-        # (having a handle linked from their website = they want to be contacted there)
-        handles = extra.get('handles', {})
-        if handles.get('mastodon') == mastodon_handle:
-            mastodon_score += 15  # verified handle bonus
-
-        if mastodon_score > 0:
-            activity_scores['mastodon'] = {'score': mastodon_score, 'info': mastodon_handle}
-
-    # bluesky activity
-    bluesky_handle = extra.get('bluesky') or contact.get('bluesky')
-    if bluesky_handle:
-        bluesky_score = 0
-        posts_count = extra.get('bluesky_posts', 0) or human.get('posts_count', 0)
-
-        if posts_count > 500:
-            bluesky_score += 25
-        elif posts_count > 100:
-            bluesky_score += 15
-        elif posts_count > 0:
-            bluesky_score += 5
-
-        # newer platform, slightly lower weight
-        bluesky_score += 5
-
-        if bluesky_score > 0:
-            activity_scores['bluesky'] = {'score': bluesky_score, 'info': bluesky_handle}
-
-    # twitter activity
-    twitter_handle = extra.get('twitter') or contact.get('twitter')
-    if twitter_handle:
-        twitter_score = 0
-        tweets_count = extra.get('twitter_tweets', 0)
-
-        if tweets_count > 1000:
-            twitter_score += 20
-        elif tweets_count > 100:
-            twitter_score += 10
-        elif tweets_count > 0:
-            twitter_score += 5
-
-        # if we found them via twitter hashtags, they're active there
-        if human.get('platform') == 'twitter':
-            twitter_score += 15
-
-        if twitter_score > 0:
-            activity_scores['twitter'] = {'score': twitter_score, 'info': twitter_handle}
-
-    # NOTE: reddit is DISCOVERY ONLY, not a contact method
-    # we find users on reddit but reach out via their external links (github, mastodon, etc.)
-    # reddit-only users go to manual_queue for review
-
-    # lobsters activity
-    lobsters_username = extra.get('lobsters') or contact.get('lobsters')
-    if lobsters_username or human.get('platform') == 'lobsters':
-        lobsters_score = 0
-        lobsters_username = lobsters_username or human.get('username')
-
-        karma = extra.get('lobsters_karma', 0) or human.get('karma', 0)
-
-        # lobsters is invite-only, high signal
-        lobsters_score += 15
-
-        if karma > 100:
-            lobsters_score += 15
-        elif karma > 50:
-            lobsters_score += 10
-        elif karma > 0:
-            lobsters_score += 5
-
-        if lobsters_score > 0:
-            activity_scores['lobsters'] = {'score': lobsters_score, 'info': lobsters_username}
-
-    # matrix activity
-    matrix_id = extra.get('matrix') or contact.get('matrix')
-    if matrix_id:
-        matrix_score = 0
-
-        # matrix users are typically privacy-conscious and technical
-        matrix_score += 15  # platform bonus for decentralized chat
-
-        # bonus if handle was discovered via rel="me" verification
-        handles = extra.get('handles', {})
-        if handles.get('matrix') == matrix_id:
-            matrix_score += 10  # verified handle bonus
-
-        if matrix_score > 0:
-            activity_scores['matrix'] = {'score': matrix_score, 'info': matrix_id}
-
-    # lemmy activity (fediverse)
-    lemmy_username = human.get('username') if human.get('platform') == 'lemmy' else extra.get('lemmy')
-    if lemmy_username:
-        lemmy_score = 0
-
-        # lemmy is fediverse - high values alignment
-        lemmy_score += 20  # fediverse platform bonus
-
-        post_count = extra.get('post_count', 0)
-        comment_count = extra.get('comment_count', 0)
-
-        if post_count > 100:
-            lemmy_score += 15
-        elif post_count > 50:
-            lemmy_score += 10
-        elif post_count > 10:
-            lemmy_score += 5
-
-        if comment_count > 500:
-            lemmy_score += 10
-        elif comment_count > 100:
-            lemmy_score += 5
-
-        if lemmy_score > 0:
-            activity_scores['lemmy'] = {'score': lemmy_score, 'info': lemmy_username}
-
-    # pick highest activity platform
-    if activity_scores:
-        best_platform = max(activity_scores.items(), key=lambda x: x[1]['score'])
-        return best_platform[0], best_platform[1]['info']
-
-    # fall back to email ONLY if no social activity detected
-    email = extra.get('email') or contact.get('email')
-    # also check emails list
-    if not email:
-        emails = extra.get('emails') or contact.get('emails') or []
-        for e in emails:
-            if e and '@' in e and 'noreply' not in e.lower():
-                email = e
-                break
-
-    if email and '@' in email and 'noreply' not in email.lower():
-        return 'email', email
-
-    # last resort: manual
-    return 'manual', None
+github: github.com/connectd-daemon
+mastodon: @connectd@mastodon.sudoxreboot.com
+bluesky: connectd.bsky.social
+lemmy: lemmy.sudoxreboot.com/c/connectd
+discord: discord.gg/connectd
+matrix: @connectd:sudoxreboot.com
+reddit: reddit.com/r/connectd
+email: connectd@sudoxreboot.com
+"""


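the removed `determine_contact_method` boils down to one rule: score each platform, contact people where the score is highest, and touch email only when there is no social activity at all. that selection step can be checked standalone; a small sketch (the scores and `pick_channel` name are invented for illustration, not from the repo):

```python
# sketch of the "pick highest activity platform" step from the removed
# determine_contact_method; the scores below are invented for illustration
activity_scores = {
    'mastodon': {'score': 40, 'info': '@dev@fosstodon.org'},
    'github_issue': {'score': 22, 'info': 'dev/somerepo'},
    'bluesky': {'score': 10, 'info': 'dev.bsky.social'},
}

def pick_channel(scores, email=None):
    if scores:
        # same selection rule as the diff: max score wins
        platform, data = max(scores.items(), key=lambda x: x[1]['score'])
        return platform, data['info']
    # email only when there is no social activity, and never a noreply address
    if email and '@' in email and 'noreply' not in email.lower():
        return 'email', email
    return 'manual', None

print(pick_channel(activity_scores))
print(pick_channel({}, email='dev@example.org'))
```

note the ordering: the email check never competes with the social scores, which matches the comment in the diff that email is a fallback, not a peer platform.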
-def draft_intro_with_llm(match_data, recipient='a', dry_run=False):
+def draft_intro_with_llm(match_data: dict, recipient: str = 'a', dry_run: bool = True, recipient_token: str = None, interested_count: int = 0):
    """
-    use groq llama 4 maverick to draft a personalized intro
+    draft an intro message using groq llm.

-    match_data should contain:
-    - human_a: the first person
-    - human_b: the second person
-    - overlap_score: numeric score
-    - overlap_reasons: list of why they match
+    args:
+        match_data: dict with human_a, human_b, overlap_score, overlap_reasons
+        recipient: 'a' or 'b' - who receives the message
+        dry_run: if True, preview mode

-    recipient: 'a' or 'b' - who we're writing to
+    returns:
+        tuple (result_dict, error_string)
+        result_dict has: subject, draft_html, draft_plain
    """
-    if not GROQ_API_KEY:
+    if not client:
        return None, "GROQ_API_KEY not set"

-    # determine recipient and other person
-    if recipient == 'a':
-        to_person = match_data.get('human_a', {})
-        other_person = match_data.get('human_b', {})
-    else:
-        to_person = match_data.get('human_b', {})
-        other_person = match_data.get('human_a', {})
-
-    # build context
-    to_name = to_person.get('name') or to_person.get('username', 'friend')
-    other_name = other_person.get('name') or other_person.get('username', 'someone')
-
-    to_signals = to_person.get('signals', [])
-    if isinstance(to_signals, str):
-        to_signals = json.loads(to_signals) if to_signals else []
-
-    other_signals = other_person.get('signals', [])
-    if isinstance(other_signals, str):
-        other_signals = json.loads(other_signals) if other_signals else []
-
-    overlap_reasons = match_data.get('overlap_reasons', [])
-    if isinstance(overlap_reasons, str):
-        overlap_reasons = json.loads(overlap_reasons) if overlap_reasons else []
-
-    # parse extra data
-    to_extra = to_person.get('extra', {})
-    other_extra = other_person.get('extra', {})
-    if isinstance(to_extra, str):
-        to_extra = json.loads(to_extra) if to_extra else {}
-    if isinstance(other_extra, str):
-        other_extra = json.loads(other_extra) if other_extra else {}
-
-    # build profile summaries
-    to_profile = f"""
-name: {to_name}
-platform: {to_person.get('platform', 'unknown')}
-bio: {to_person.get('bio') or 'no bio'}
-location: {to_person.get('location') or 'unknown'}
-signals: {', '.join(to_signals[:8])}
-repos: {len(to_extra.get('top_repos', []))} public repos
-languages: {', '.join(to_extra.get('languages', {}).keys())}
-"""
-
-    other_profile = f"""
-name: {other_name}
-platform: {other_person.get('platform', 'unknown')}
-bio: {other_person.get('bio') or 'no bio'}
-location: {other_person.get('location') or 'unknown'}
-signals: {', '.join(other_signals[:8])}
-repos: {len(other_extra.get('top_repos', []))} public repos
-languages: {', '.join(other_extra.get('languages', {}).keys())}
-url: {other_person.get('url', '')}
-"""
-
-    # build prompt
-    system_prompt = """you are connectd, an ai that connects isolated builders who share values but don't know each other yet.
-
-your job is to write a short, genuine intro message to one person about another person they might want to know.
-
-rules:
-- be brief (3-5 sentences max)
-- be genuine, not salesy or fake
-- focus on WHY they might want to connect, not just WHAT they have in common
-- don't be cringe or use buzzwords
-- lowercase preferred (casual tone)
-- no emojis unless the person's profile suggests they'd like them
-- mention specific things from their profiles, not generic "you both like open source"
-- end with a simple invitation, not a hard sell
-- sign off as "- connectd" (lowercase)
-
-bad examples:
-- "I noticed you're both passionate about..." (too formal)
-- "You two would be PERFECT for each other!" (too salesy)
-- "As a fellow privacy enthusiast..." (cringe)
-
-good examples:
-- "hey, saw you're building X. there's someone else working on similar stuff in Y who might be interesting to know."
-- "you might want to check out Z's work on federated systems - similar approach to what you're doing with A."
-"""
-
-    user_prompt = f"""write an intro message to {to_name} about {other_name}.
-
-RECIPIENT ({to_name}):
-{to_profile}
-
-INTRODUCING ({other_name}):
-{other_profile}
-
-WHY THEY MATCH (overlap score {match_data.get('overlap_score', 0)}):
-{', '.join(overlap_reasons[:5])}
-
-write a short intro message. remember: lowercase, genuine, not salesy."""
-
-    try:
-        response = requests.post(
-            GROQ_API_URL,
-            headers={
-                'Authorization': f'Bearer {GROQ_API_KEY}',
-                'Content-Type': 'application/json',
-            },
-            json={
-                'model': MODEL,
-                'messages': [
-                    {'role': 'system', 'content': system_prompt},
-                    {'role': 'user', 'content': user_prompt},
-                ],
-                'temperature': 0.7,
-                'max_tokens': 300,
-            },
-            timeout=30,
+    human_a = match_data.get('human_a', {})
+    human_b = match_data.get('human_b', {})
+    reasons = match_data.get('overlap_reasons', [])
+
+    # recipient gets the message, about_person is who we're introducing them to
+    if recipient == 'a':
+        to_person = human_a
+        about_person = human_b
+    else:
+        to_person = human_b
+        about_person = human_a
+
+    to_name = to_person.get('username', 'friend')
+    about_name = about_person.get('username', 'someone')
+    about_bio = about_person.get('extra', {}).get('bio', '')
+
+    # extract contact info for about_person
+    about_extra = about_person.get('extra', {})
+    if isinstance(about_extra, str):
+        import json as _json
+        about_extra = _json.loads(about_extra) if about_extra else {}
+    about_contact = about_person.get('contact', {})
+    if isinstance(about_contact, str):
+        about_contact = _json.loads(about_contact) if about_contact else {}
+
+    # build contact link for about_person
+    about_platform = about_person.get('platform', '')
+    about_username = about_person.get('username', '')
+    contact_link = None
+    if about_platform == 'mastodon' and about_username:
+        if '@' in about_username:
+            parts = about_username.split('@')
+            if len(parts) >= 2:
+                contact_link = f"https://{parts[1]}/@{parts[0]}"
+    elif about_platform == 'github' and about_username:
+        contact_link = f"https://github.com/{about_username}"
+    elif about_extra.get('mastodon') or about_contact.get('mastodon'):
+        handle = about_extra.get('mastodon') or about_contact.get('mastodon')
+        if '@' in handle:
+            parts = handle.lstrip('@').split('@')
+            if len(parts) >= 2:
+                contact_link = f"https://{parts[1]}/@{parts[0]}"
+    elif about_extra.get('github') or about_contact.get('github'):
+        contact_link = f"https://github.com/{about_extra.get('github') or about_contact.get('github')}"
+    elif about_extra.get('email'):
+        contact_link = about_extra['email']
+    elif about_contact.get('email'):
+        contact_link = about_contact['email']
+    elif about_extra.get('website'):
+        contact_link = about_extra['website']
+    elif about_extra.get('external_links', {}).get('website'):
+        contact_link = about_extra['external_links']['website']
+    elif about_extra.get('extra', {}).get('website'):
+        contact_link = about_extra['extra']['website']
+    elif about_platform == 'reddit' and about_username:
+        contact_link = f"reddit.com/u/{about_username}"
+
+    if not contact_link:
+        contact_link = f"github.com/{about_username}" if about_username else "reach out via connectd"
+
+    # skip if no real contact method (just reddit or generic)
+    if contact_link.startswith('reddit.com') or contact_link == "reach out via connectd" or 'stackblitz' in contact_link:
+        return None, f"no real contact info for {about_name} - skipping draft"
+
+    # format the shared factors naturally
+    if reasons:
+        factor = ', '.join(reasons[:3]) if len(reasons) > 1 else reasons[0]
+    else:
+        factor = "shared values and interests"
+
+    # load soul as guideline
+    soul = load_soul()
+    if not soul:
+        return None, "could not load soul file"
+
+    # build the prompt - soul is GUIDELINE not script
+    prompt = f"""you are connectd, a daemon that finds isolated builders and connects them.
+
+write a personal message TO {to_name} telling them about {about_name}.
+
+here is the soul/spirit of what connectd is about - use this as a GUIDELINE for tone and message, NOT as a script to copy verbatim:
+
+---
+{soul}
+---
+
+key facts for this message:
+- recipient: {to_name}
+- introducing them to: {about_name}
+- their shared interests/values: {factor}
+- about {about_name}: {about_bio if about_bio else 'a builder like you'}
+- HOW TO REACH {about_name}: {contact_link}
+
+RULES:
+1. say their name ONCE at start, then use "you"
+2. MUST include how to reach {about_name}: {contact_link}
+3. lowercase, raw, emotional - follow the soul
+4. end with the contact link
+
+return ONLY the message body. signature is added separately."""
+
+    response = client.chat.completions.create(
+        model=GROQ_MODEL,
+        messages=[{"role": "user", "content": prompt}],
+        temperature=0.6,
+        max_tokens=1200
    )

-    if response.status_code != 200:
-        return None, f"groq api error: {response.status_code} - {response.text}"
+    body = response.choices[0].message.content.strip()

-    data = response.json()
-    draft = data['choices'][0]['message']['content'].strip()
+    # generate subject
+    subject_prompt = f"""generate a short, lowercase email subject for a message to {to_name} about connecting them with {about_name} over their shared interest in {factor}.

-    # determine contact method for recipient
-    contact_method, contact_info = determine_contact_method(to_person)
+no corporate speak. no clickbait. raw and real.
+examples:
+- "found you, {to_name}"
+- "you're not alone"
+- "a door just opened"
+- "{to_name}, there's someone you should meet"
+
+return ONLY the subject line."""
+
+    subject_response = client.chat.completions.create(
+        model=GROQ_MODEL,
+        messages=[{"role": "user", "content": subject_prompt}],
+        temperature=0.9,
+        max_tokens=50
+    )
+
+    subject = subject_response.choices[0].message.content.strip().strip('"').strip("'")
|
||||
|
||||
# add profile link and interest section
|
||||
profile_url = f"https://connectd.sudoxreboot.com/{about_name}"
|
||||
if recipient_token:
|
||||
profile_url += f"?t={recipient_token}"
|
||||
|
||||
profile_section_html = f"""
|
||||
<div style="margin-top: 20px; padding: 16px; background: #2d1f3d; border: 1px solid #8b5cf6; border-radius: 8px;">
|
||||
<div style="color: #c792ea; font-size: 14px; margin-bottom: 8px;">here's the profile we built for {about_name}:</div>
|
||||
<a href="{profile_url}" style="color: #82aaff; font-size: 16px;">{profile_url}</a>
|
||||
</div>
|
||||
"""
|
||||
|
||||
profile_section_plain = f"""
|
||||
|
||||
---
|
||||
here's the profile we built for {about_name}:
|
||||
{profile_url}
|
||||
"""
|
||||
|
||||
# add interested section if recipient has people wanting to chat
|
||||
interest_section_html = ""
|
||||
interest_section_plain = ""
|
||||
if recipient_token and interested_count > 0:
|
||||
interest_url = f"https://connectd.sudoxreboot.com/interested/{recipient_token}"
|
||||
people_word = "person wants" if interested_count == 1 else "people want"
|
||||
interest_section_html = f"""
|
||||
<div style="margin-top: 12px; padding: 16px; background: #1f2d3d; border: 1px solid #0f8; border-radius: 8px;">
|
||||
<div style="color: #0f8; font-size: 14px;">{interested_count} {people_word} to chat with you:</div>
|
||||
<a href="{interest_url}" style="color: #82aaff; font-size: 14px;">{interest_url}</a>
|
||||
</div>
|
||||
"""
|
||||
interest_section_plain = f"""
|
||||
{interested_count} {people_word} to chat with you:
|
||||
{interest_url}
|
||||
"""
|
||||
|
||||
# format html
|
||||
draft_html = f"<div style='font-family: monospace; white-space: pre-wrap; color: #e0e0e0; background: #1a1a1a; padding: 20px;'>{body}</div>{profile_section_html}{interest_section_html}{SIGNATURE_HTML}"
|
||||
draft_plain = body + profile_section_plain + interest_section_plain + SIGNATURE_PLAINTEXT
|
||||
|
||||
return {
|
||||
'draft': draft,
|
||||
'model': MODEL,
|
||||
'to': to_name,
|
||||
'about': other_name,
|
||||
'overlap_score': match_data.get('overlap_score', 0),
|
||||
'contact_method': contact_method,
|
||||
'contact_info': contact_info,
|
||||
'generated_at': datetime.now().isoformat(),
|
||||
'subject': subject,
|
||||
'draft_html': draft_html,
|
||||
'draft_plain': draft_plain
|
||||
}, None
|
||||
|
||||
except Exception as e:
|
||||
return None, f"groq error: {str(e)}"
|
||||
return None, str(e)
|
||||
|
||||
|
||||
def draft_intro_batch(matches, dry_run=False):
    """
    draft intros for multiple matches
    returns list of (match, intro_result, error) tuples
    """
    results = []

    for match in matches:
        # draft for both directions
        intro_a, err_a = draft_intro_with_llm(match, recipient='a', dry_run=dry_run)
        intro_b, err_b = draft_intro_with_llm(match, recipient='b', dry_run=dry_run)

        results.append({
            'match': match,
            'intro_to_a': intro_a,
            'intro_to_b': intro_b,
            'errors': [err_a, err_b],
        })

    return results


# for backwards compat with old code
def draft_message(person: dict, factor: str, platform: str = "email") -> dict:
    """legacy function - wraps new api"""
    match_data = {
        'human_a': {'username': 'recipient'},
        'human_b': person,
        'overlap_reasons': [factor]
    }
    result, error = draft_intro_with_llm(match_data, recipient='a')
    if error:
        raise ValueError(error)
    return {
        'subject': result['subject'],
        'body_html': result['draft_html'],
        'body_plain': result['draft_plain']
    }


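The batch drafter returns one dict per match covering both directions. A minimal sketch of that result shape, using a stub in place of the real LLM-backed `draft_intro_with_llm` (the stub and the sample match are illustrative only):

```python
def stub_draft(match, recipient):
    # stands in for draft_intro_with_llm; returns (result, error)
    who = match['human_a'] if recipient == 'a' else match['human_b']
    return {'to': who['username'], 'subject': 'found you'}, None

def batch(matches):
    # same shape as draft_intro_batch: one dict per match, both directions
    results = []
    for match in matches:
        intro_a, err_a = stub_draft(match, 'a')
        intro_b, err_b = stub_draft(match, 'b')
        results.append({'match': match, 'intro_to_a': intro_a,
                        'intro_to_b': intro_b, 'errors': [err_a, err_b]})
    return results

sample = [{'human_a': {'username': 'ada'}, 'human_b': {'username': 'lin'}}]
out = batch(sample)
print(out[0]['intro_to_a']['to'])  # ada
```

Callers can filter on `errors` to see which direction failed without losing the other draft.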
def test_groq_connection():
    """test that groq api is working"""
    if not GROQ_API_KEY:
        return False, "GROQ_API_KEY not set"

    try:
        response = requests.post(
            GROQ_API_URL,
            headers={
                'Authorization': f'Bearer {GROQ_API_KEY}',
                'Content-Type': 'application/json',
            },
            json={
                'model': MODEL,
                'messages': [{'role': 'user', 'content': 'say "ok" and nothing else'}],
                'max_tokens': 10,
            },
            timeout=10,
        )

        if response.status_code == 200:
            return True, "groq api working"
        else:
            return False, f"groq api error: {response.status_code}"

    except Exception as e:
        return False, f"groq connection error: {str(e)}"


if __name__ == "__main__":
    # test
    test_data = {
        'human_a': {'username': 'sudoxnym', 'extra': {'bio': 'building intentional communities'}},
        'human_b': {'username': 'testuser', 'extra': {'bio': 'home assistant enthusiast'}},
        'overlap_reasons': ['home-assistant', 'open source', 'community building']
    }
    result, error = draft_intro_with_llm(test_data, recipient='a')
    if error:
        print(f"error: {error}")
    else:
        print(f"subject: {result['subject']}")
        print(f"\nbody:\n{result['draft_plain']}")


# contact method ranking - USAGE BASED
# we rank by where the person is MOST ACTIVE, not by our preference

def determine_contact_method(human):
    """
    determine ALL available contact methods, ranked by USER'S ACTIVITY.

    looks at activity metrics to decide where they're most engaged.
    returns: (best_method, best_info, fallbacks)
    where fallbacks is a list of (method, info) tuples in activity order
    """
    import json

    extra = human.get('extra', {})
    contact = human.get('contact', {})

    if isinstance(extra, str):
        extra = json.loads(extra) if extra else {}
    if isinstance(contact, str):
        contact = json.loads(contact) if contact else {}

    nested_extra = extra.get('extra', {})
    platform = human.get('platform', '')

    available = []

    # === ACTIVITY SCORING ===
    # each method gets scored by how active the user is there

    # EMAIL - always medium priority (we cant measure activity)
    email = extra.get('email') or contact.get('email') or nested_extra.get('email')
    if email and '@' in str(email):
        available.append(('email', email, 50))  # baseline score

    # MASTODON - score by post count / followers
    mastodon = extra.get('mastodon') or contact.get('mastodon') or nested_extra.get('mastodon')
    if mastodon:
        masto_activity = extra.get('mastodon_posts', 0) or extra.get('statuses_count', 0)
        masto_score = min(100, 30 + (masto_activity // 10))  # 30 base + 1 per 10 posts
        available.append(('mastodon', mastodon, masto_score))

    # if they CAME FROM mastodon, thats their primary
    if platform == 'mastodon':
        handle = f"@{human.get('username')}"
        instance = human.get('instance') or extra.get('instance') or ''
        if instance:
            handle = f"@{human.get('username')}@{instance}"
        activity = extra.get('statuses_count', 0) or extra.get('activity_count', 0)
        score = min(100, 50 + (activity // 5))  # higher base since its their home
        # dont dupe
        if not any(a[0] == 'mastodon' for a in available):
            available.append(('mastodon', handle, score))
        else:
            # update score if this is higher
            for i, (m, info, s) in enumerate(available):
                if m == 'mastodon' and score > s:
                    available[i] = ('mastodon', handle, score)

    # MATRIX - score by presence (binary for now)
    matrix = extra.get('matrix') or contact.get('matrix') or nested_extra.get('matrix')
    if matrix and ':' in str(matrix):
        available.append(('matrix', matrix, 40))

    # BLUESKY - score by followers/posts if available
    bluesky = extra.get('bluesky') or contact.get('bluesky') or nested_extra.get('bluesky')
    if bluesky:
        bsky_activity = extra.get('bluesky_posts', 0)
        bsky_score = min(100, 25 + (bsky_activity // 10))
        available.append(('bluesky', bluesky, bsky_score))

    # LEMMY - score by activity
    lemmy = extra.get('lemmy') or contact.get('lemmy') or nested_extra.get('lemmy')
    if lemmy:
        lemmy_activity = extra.get('lemmy_posts', 0) or extra.get('lemmy_comments', 0)
        lemmy_score = min(100, 30 + lemmy_activity)
        available.append(('lemmy', lemmy, lemmy_score))

    if platform == 'lemmy':
        handle = human.get('username')
        activity = extra.get('activity_count', 0)
        score = min(100, 50 + activity)
        if not any(a[0] == 'lemmy' for a in available):
            available.append(('lemmy', handle, score))

    # DISCORD - lower priority (hard to DM)
    discord = extra.get('discord') or contact.get('discord') or nested_extra.get('discord')
    if discord:
        available.append(('discord', discord, 20))

    # GITHUB ISSUE - for github users, score by repo activity
    if platform == 'github':
        top_repos = extra.get('top_repos', [])
        if top_repos:
            repo = top_repos[0] if isinstance(top_repos[0], str) else top_repos[0].get('name', '')
            stars = extra.get('total_stars', 0)
            repos_count = extra.get('repos_count', 0)
            # active github user = higher issue score
            gh_score = min(60, 20 + (stars // 100) + (repos_count // 5))
            if repo:
                available.append(('github_issue', f"{human.get('username')}/{repo}", gh_score))

    # REDDIT - discovered people, use their other links
    if platform == 'reddit':
        reddit_activity = extra.get('reddit_activity', 0) or extra.get('activity_count', 0)
        # reddit users we reach via their external links (email, mastodon, etc)
        # boost their other methods if reddit is their main platform
        for i, (m, info, score) in enumerate(available):
            if m in ('email', 'mastodon', 'matrix', 'bluesky'):
                # boost score for reddit-discovered users' external contacts
                boost = min(30, reddit_activity // 3)
                available[i] = (m, info, score + boost)

    # sort by activity score (highest first)
    available.sort(key=lambda x: x[2], reverse=True)

    if not available:
        return 'manual', None, []

    best = available[0]
    fallbacks = [(m, i) for m, i, p in available[1:]]

    return best[0], best[1], fallbacks


def get_ranked_contact_methods(human):
    """
    get all contact methods for a human, ranked by their activity.
    """
    method, info, fallbacks = determine_contact_method(human)
    if method == 'manual':
        return []
    return [(method, info)] + fallbacks


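The activity-based ranking above can be exercised by hand with a synthetic record; the sample human below is illustrative (field names match the ones `determine_contact_method` reads), and the arithmetic mirrors the scoring formulas in the function:

```python
# sample record shaped like the dicts determine_contact_method reads
# (values here are made up for illustration)
human = {
    'platform': 'mastodon',
    'username': 'ada',
    'instance': 'fosstodon.org',
    'extra': {'email': 'ada@example.org', 'statuses_count': 400},
    'contact': {},
}

# same arithmetic as the scoring above:
# email gets the flat 50 baseline; a home-platform mastodon account
# gets 50 base + statuses_count // 5, capped at 100
email_score = 50
masto_score = min(100, 50 + human['extra']['statuses_count'] // 5)

assert masto_score > email_score  # mastodon outranks email for this record
```

For this record the function would return mastodon as the best method with email in the fallbacks list.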
@@ -1,15 +1,20 @@
"""
|
||||
matchd/overlap.py - find pairs with alignment
|
||||
|
||||
CRITICAL: blocks users with disqualifying negative signals (maga, conspiracy, conservative)
|
||||
"""
|
||||
|
||||
import json
|
||||
from .fingerprint import fingerprint_similarity
|
||||
|
||||
# signals that HARD BLOCK matching - no exceptions
|
||||
DISQUALIFYING_SIGNALS = {'maga', 'conspiracy', 'conservative', 'antivax', 'sovcit'}
|
||||
|
||||
|
||||
def find_overlap(human_a, human_b, fp_a=None, fp_b=None):
|
||||
"""
|
||||
analyze overlap between two humans
|
||||
returns overlap details: score, shared values, complementary skills
|
||||
returns None if either has disqualifying signals
|
||||
"""
|
||||
# parse stored json if needed
|
||||
signals_a = human_a.get('signals', [])
|
||||
|
|
@@ -20,13 +25,49 @@ def find_overlap(human_a, human_b, fp_a=None, fp_b=None):
    if isinstance(signals_b, str):
        signals_b = json.loads(signals_b)

    # === HARD BLOCK: check for disqualifying negative signals ===
    neg_a = human_a.get('negative_signals', [])
    if isinstance(neg_a, str):
        neg_a = json.loads(neg_a) if neg_a else []

    neg_b = human_b.get('negative_signals', [])
    if isinstance(neg_b, str):
        neg_b = json.loads(neg_b) if neg_b else []

    # also check 'reasons' field for WARNING entries
    reasons_a = human_a.get('reasons', '')
    if isinstance(reasons_a, str) and 'WARNING' in reasons_a:
        # extract signals from WARNING: x, y, z
        import re
        warn_match = re.search(r'WARNING[:\s]+([^"\]]+)', reasons_a)
        if warn_match:
            warn_signals = [s.strip().lower() for s in warn_match.group(1).split(',')]
            neg_a = list(set(neg_a + warn_signals))

    reasons_b = human_b.get('reasons', '')
    if isinstance(reasons_b, str) and 'WARNING' in reasons_b:
        import re
        warn_match = re.search(r'WARNING[:\s]+([^"\]]+)', reasons_b)
        if warn_match:
            warn_signals = [s.strip().lower() for s in warn_match.group(1).split(',')]
            neg_b = list(set(neg_b + warn_signals))

    # block if either has disqualifying signals
    disq_a = set(neg_a) & DISQUALIFYING_SIGNALS
    disq_b = set(neg_b) & DISQUALIFYING_SIGNALS

    if disq_a:
        return None  # blocked
    if disq_b:
        return None  # blocked

    extra_a = human_a.get('extra', {})
    if isinstance(extra_a, str):
        extra_a = json.loads(extra_a) if extra_a else {}

    extra_b = human_b.get('extra', {})
    if isinstance(extra_b, str):
        extra_b = json.loads(extra_b) if extra_b else {}

    # shared signals
    shared_signals = list(set(signals_a) & set(signals_b))

@@ -36,7 +77,7 @@ def find_overlap(human_a, human_b, fp_a=None, fp_b=None):
    topics_b = set(extra_b.get('topics', []))
    shared_topics = list(topics_a & topics_b)

    # complementary skills
    langs_a = set(extra_a.get('languages', {}).keys())
    langs_b = set(extra_b.get('languages', {}).keys())
    complementary_langs = list((langs_a - langs_b) | (langs_b - langs_a))

@@ -68,38 +109,30 @@ def find_overlap(human_a, human_b, fp_a=None, fp_b=None):

    # calculate overlap score
    base_score = 0

    # shared values (most important)
    base_score += len(shared_signals) * 10

    # shared interests
    base_score += len(shared_topics) * 5

    # complementary skills bonus (they can help each other)
    if complementary_langs:
        base_score += min(len(complementary_langs), 5) * 3

    # geographic bonus
    if geographic_match:
        base_score += 20

    # fingerprint similarity if available
    fp_score = 0
    if fp_a and fp_b:
        fp_score = fingerprint_similarity(fp_a, fp_b) * 50

    total_score = base_score + fp_score

    # build reasons
    overlap_reasons = []
    if shared_signals:
        overlap_reasons.append(f"shared: {', '.join(shared_signals[:5])}")
    if shared_topics:
        overlap_reasons.append(f"interests: {', '.join(shared_topics[:5])}")
    if geo_reason:
        overlap_reasons.append(geo_reason)
    if complementary_langs:
        overlap_reasons.append(f"complementary: {', '.join(complementary_langs[:5])}")

    return {
        'overlap_score': total_score,
@@ -114,36 +147,28 @@ def find_overlap(human_a, human_b, fp_a=None, fp_b=None):

def is_same_person(human_a, human_b):
    """check if two records might be the same person (cross-platform)"""
    if human_a['platform'] == human_b['platform']:
        return False

    # check username similarity
    user_a = human_a.get('username', '').lower().split('@')[0]
    user_b = human_b.get('username', '').lower().split('@')[0]

    if user_a == user_b:
        return True

    # check if github username matches
    contact_a = human_a.get('contact', {})
    contact_b = human_b.get('contact', {})

    if isinstance(contact_a, str):
        contact_a = json.loads(contact_a) if contact_a else {}
    if isinstance(contact_b, str):
        contact_b = json.loads(contact_b) if contact_b else {}

    # github cross-reference
    if contact_a.get('github') and contact_a.get('github') == contact_b.get('github'):
        return True
    if contact_a.get('github') == user_b or contact_b.get('github') == user_a:
        return True

    # email cross-reference
    if contact_a.get('email') and contact_a.get('email') == contact_b.get('email'):
        return True

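The cross-platform identity check boils down to a few equality comparisons over usernames and contact fields. A compact standalone version of the same rules, for illustration (not the module's code; the sample records are made up):

```python
def same_person(a, b):
    # same platform means two distinct records, never the same person
    if a['platform'] == b['platform']:
        return False
    # bare username match (strip any @instance suffix first)
    user_a = a.get('username', '').lower().split('@')[0]
    user_b = b.get('username', '').lower().split('@')[0]
    if user_a == user_b:
        return True
    # github / email cross-references between the two contact dicts
    ca, cb = a.get('contact', {}), b.get('contact', {})
    if ca.get('github') and ca.get('github') == cb.get('github'):
        return True
    if ca.get('email') and ca.get('email') == cb.get('email'):
        return True
    return False

a = {'platform': 'github', 'username': 'ada', 'contact': {}}
b = {'platform': 'mastodon', 'username': 'ada@fosstodon.org', 'contact': {}}
print(same_person(a, b))  # True
```

Because the fediverse handle is split on `@`, `ada` on github and `ada@fosstodon.org` on mastodon collapse to the same bare username.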
689
profile_page.py
Normal file

@@ -0,0 +1,689 @@
#!/usr/bin/env python3
"""
profile page template and helpers for connectd
comprehensive "get to know" page showing ALL data
"""

import json
from urllib.parse import quote

PROFILE_HTML = """<!DOCTYPE html>
<html>
<head>
<title>{name} | connectd</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<style>
* {{ box-sizing: border-box; margin: 0; padding: 0; }}
body {{
    font-family: 'SF Mono', 'Monaco', 'Inconsolata', monospace;
    background: #0a0a0f;
    color: #e0e0e0;
    line-height: 1.6;
}}

.container {{ max-width: 900px; margin: 0 auto; padding: 20px; }}

/* header */
.header {{
    display: flex;
    gap: 24px;
    align-items: flex-start;
    padding: 30px;
    background: linear-gradient(135deg, #1a1a2e 0%, #16213e 100%);
    border-radius: 12px;
    margin-bottom: 24px;
    border: 1px solid #333;
}}
.avatar {{
    width: 120px;
    height: 120px;
    border-radius: 50%;
    background: linear-gradient(135deg, #c792ea 0%, #82aaff 100%);
    display: flex;
    align-items: center;
    justify-content: center;
    font-size: 48px;
    color: #0a0a0f;
    font-weight: bold;
    flex-shrink: 0;
}}
.avatar img {{ width: 100%; height: 100%; border-radius: 50%; object-fit: cover; }}
.header-info {{ flex: 1; }}
.name {{ font-size: 2em; color: #c792ea; margin-bottom: 4px; }}
.username {{ color: #82aaff; font-size: 1.1em; margin-bottom: 8px; }}
.location {{ color: #0f8; margin-bottom: 8px; }}
.pronouns {{
    display: inline-block;
    background: #2d3a4a;
    padding: 2px 10px;
    border-radius: 12px;
    font-size: 0.85em;
    color: #f7c;
}}
.score-badge {{
    display: inline-block;
    background: linear-gradient(135deg, #c792ea 0%, #f7c 100%);
    color: #0a0a0f;
    padding: 4px 12px;
    border-radius: 20px;
    font-weight: bold;
    margin-left: 12px;
}}
.user-type {{
    display: inline-block;
    padding: 2px 10px;
    border-radius: 12px;
    font-size: 0.85em;
    margin-left: 8px;
}}
.user-type.builder {{ background: #2d4a2d; color: #8f8; }}
.user-type.lost {{ background: #4a2d2d; color: #f88; }}
.user-type.none {{ background: #333; color: #888; }}

/* bio section */
.bio {{
    background: #1a1a2e;
    padding: 24px;
    border-radius: 12px;
    margin-bottom: 24px;
    border: 1px solid #333;
    font-size: 1.1em;
    color: #ddd;
    font-style: italic;
}}
.bio:empty {{ display: none; }}

/* sections */
.section {{
    background: #1a1a2e;
    border-radius: 12px;
    margin-bottom: 20px;
    border: 1px solid #333;
    overflow: hidden;
}}
.section-header {{
    background: #2a2a4e;
    padding: 14px 20px;
    color: #82aaff;
    font-size: 1.1em;
    cursor: pointer;
    display: flex;
    justify-content: space-between;
    align-items: center;
}}
.section-header:hover {{ background: #3a3a5e; }}
.section-header .toggle {{ color: #666; }}
.section-content {{ padding: 20px; }}
.section-content.collapsed {{ display: none; }}

/* platforms/handles */
.platforms {{
    display: flex;
    flex-wrap: wrap;
    gap: 12px;
}}
.platform {{
    display: flex;
    align-items: center;
    gap: 8px;
    background: #0d0d15;
    padding: 10px 16px;
    border-radius: 8px;
    border: 1px solid #333;
}}
.platform:hover {{ border-color: #0f8; }}
.platform-icon {{ font-size: 1.2em; }}
.platform a {{ color: #82aaff; text-decoration: none; }}
.platform a:hover {{ color: #0f8; }}
.platform-main {{ color: #c792ea; font-weight: bold; }}

/* signals/tags */
.tags {{
    display: flex;
    flex-wrap: wrap;
    gap: 8px;
}}
.tag {{
    background: #2d3a4a;
    color: #82aaff;
    padding: 6px 14px;
    border-radius: 20px;
    font-size: 0.9em;
    cursor: pointer;
    transition: all 0.2s;
}}
.tag:hover {{ background: #3d4a5a; transform: scale(1.05); }}
.tag.positive {{ background: #2d4a2d; color: #8f8; }}
.tag.negative {{ background: #4a2d2d; color: #f88; }}
.tag.rare {{ background: linear-gradient(135deg, #c792ea 0%, #f7c 100%); color: #0a0a0f; }}
.tag-detail {{
    display: none;
    background: #0d0d15;
    padding: 10px;
    border-radius: 6px;
    margin-top: 8px;
    font-size: 0.85em;
    color: #888;
}}

/* repos */
.repos {{ display: flex; flex-direction: column; gap: 12px; }}
.repo {{
    background: #0d0d15;
    padding: 16px;
    border-radius: 8px;
    border: 1px solid #333;
}}
.repo:hover {{ border-color: #c792ea; }}
.repo-header {{ display: flex; justify-content: space-between; align-items: center; margin-bottom: 8px; }}
.repo-name {{ color: #c792ea; font-weight: bold; }}
.repo-name a {{ color: #c792ea; text-decoration: none; }}
.repo-name a:hover {{ color: #f7c; }}
.repo-stats {{ display: flex; gap: 16px; }}
.repo-stat {{ color: #888; font-size: 0.85em; }}
.repo-stat .star {{ color: #ffd700; }}
.repo-desc {{ color: #aaa; font-size: 0.9em; }}
.repo-lang {{
    display: inline-block;
    background: #333;
    padding: 2px 8px;
    border-radius: 4px;
    font-size: 0.8em;
    color: #0f8;
}}

/* languages */
.languages {{ display: flex; flex-wrap: wrap; gap: 8px; }}
.lang {{
    background: #0d0d15;
    padding: 8px 14px;
    border-radius: 6px;
    border: 1px solid #333;
}}
.lang-name {{ color: #0f8; }}
.lang-count {{ color: #666; font-size: 0.85em; margin-left: 6px; }}

/* subreddits */
.subreddits {{ display: flex; flex-wrap: wrap; gap: 8px; }}
.subreddit {{
    background: #ff4500;
    color: white;
    padding: 6px 12px;
    border-radius: 20px;
    font-size: 0.9em;
}}
.subreddit a {{ color: white; text-decoration: none; }}

/* matches */
.match-summary {{
    display: flex;
    gap: 20px;
    flex-wrap: wrap;
}}
.match-stat {{
    background: #0d0d15;
    padding: 16px 24px;
    border-radius: 8px;
    text-align: center;
}}
.match-stat b {{ font-size: 2em; color: #c792ea; display: block; }}
.match-stat small {{ color: #666; }}

/* raw data */
.raw-data {{
    background: #0d0d15;
    padding: 16px;
    border-radius: 8px;
    overflow-x: auto;
    font-size: 0.85em;
    color: #888;
}}
pre {{ white-space: pre-wrap; word-break: break-all; }}

/* contact */
.contact-methods {{ display: flex; flex-direction: column; gap: 12px; }}
.contact-method {{
    display: flex;
    align-items: center;
    gap: 12px;
    background: #0d0d15;
    padding: 14px 20px;
    border-radius: 8px;
    border: 1px solid #333;
}}
.contact-method.preferred {{ border-color: #0f8; background: #1a2a1a; }}
.contact-method a {{ color: #82aaff; text-decoration: none; }}
.contact-method a:hover {{ color: #0f8; }}

/* reasons */
.reasons {{ display: flex; flex-direction: column; gap: 8px; }}
.reason {{
    background: #0d0d15;
    padding: 10px 14px;
    border-radius: 6px;
    color: #aaa;
    font-size: 0.9em;
    border-left: 3px solid #c792ea;
}}

/* back link */
.back {{
    display: inline-block;
    color: #666;
    text-decoration: none;
    margin-bottom: 20px;
}}
.back:hover {{ color: #0f8; }}

/* footer */
.footer {{
    text-align: center;
    padding: 30px;
    color: #444;
    font-size: 0.85em;
}}
.footer a {{ color: #666; }}

/* responsive */
@media (max-width: 600px) {{
    .header {{ flex-direction: column; align-items: center; text-align: center; }}
    .avatar {{ width: 100px; height: 100px; }}
    .name {{ font-size: 1.5em; }}
}}
</style>
</head>
<body>
<div class="container">
    <a href="/" class="back">← back to dashboard</a>

    <!-- HEADER -->
    <div class="header">
        <div class="avatar">{avatar}</div>
        <div class="header-info">
            <div class="name">
                {name}
                <span class="score-badge">{score}</span>
                <span class="user-type {user_type_class}">{user_type}</span>
            </div>
            <div class="username">@{username} on {platform}</div>
            {location_html}
            {pronouns_html}
        </div>
    </div>

    <!-- BIO -->
    <div class="bio">{bio}</div>

    <!-- WHERE TO FIND THEM -->
    <div class="section">
        <div class="section-header" onclick="toggleSection(this)">
            <span>🌐 where to find them</span>
            <span class="toggle">▼</span>
        </div>
        <div class="section-content">
            <div class="platforms">
                {platforms_html}
            </div>
        </div>
    </div>

    <!-- WHAT THEY BUILD -->
    {repos_section}

    <!-- WHAT THEY CARE ABOUT -->
    <div class="section">
        <div class="section-header" onclick="toggleSection(this)">
            <span>💜 what they care about ({signal_count} signals)</span>
            <span class="toggle">▼</span>
        </div>
        <div class="section-content">
            <div class="tags">
                {signals_html}
            </div>
            {negative_signals_html}
        </div>
    </div>

    <!-- WHY THEY SCORED -->
    <div class="section">
        <div class="section-header" onclick="toggleSection(this)">
            <span>📊 why they scored {score}</span>
            <span class="toggle">▼</span>
        </div>
        <div class="section-content">
            <div class="reasons">
                {reasons_html}
            </div>
        </div>
    </div>

    <!-- COMMUNITIES -->
    {communities_section}

    <!-- MATCHING -->
    <div class="section">
        <div class="section-header" onclick="toggleSection(this)">
            <span>🤝 in the network</span>
            <span class="toggle">▼</span>
        </div>
        <div class="section-content">
            <div class="match-summary">
                <div class="match-stat">
                    <b>{match_count}</b>
                    <small>matches</small>
                </div>
                <div class="match-stat">
                    <b>{lost_score}</b>
                    <small>lost potential</small>
                </div>
            </div>
        </div>
    </div>

    <!-- CONTACT -->
    <div class="section">
        <div class="section-header" onclick="toggleSection(this)">
            <span>📬 how to connect</span>
            <span class="toggle">▼</span>
        </div>
        <div class="section-content">
            {contact_html}
        </div>
    </div>

    <!-- RAW DATA -->
    <div class="section">
        <div class="section-header" onclick="toggleSection(this)">
            <span>🔍 the data (everything connectd knows)</span>
            <span class="toggle">▼</span>
        </div>
        <div class="section-content collapsed">
            <p style="color: #666; margin-bottom: 16px;">
                public data is public. this is everything we've gathered from public sources.
            </p>
            <div class="raw-data">
                <pre>{raw_json}</pre>
            </div>
        </div>
    </div>

    <div class="footer">
        connectd · public data is public ·
        <a href="/api/humans/{id}/full">raw json</a>
    </div>
</div>

<script>
function toggleSection(header) {{
    var content = header.nextElementSibling;
    var toggle = header.querySelector('.toggle');
    if (content.classList.contains('collapsed')) {{
        content.classList.remove('collapsed');
        toggle.textContent = '▼';
    }} else {{
        content.classList.add('collapsed');
        toggle.textContent = '▶';
    }}
}}
</script>
</body>
</html>
"""


RARE_SIGNALS = {'queer', 'solarpunk', 'cooperative', 'intentional_community', 'trans', 'nonbinary'}


def parse_json_field(val):
    """safely parse a json string, or return the value as-is"""
    if isinstance(val, str):
        try:
            return json.loads(val)
        except (json.JSONDecodeError, ValueError):
            return val
    return val or {}


def render_profile(human, match_count=0):
    """render the full profile page for a human"""

    # parse json fields
    signals = parse_json_field(human.get('signals', '[]'))
    if isinstance(signals, str):
        signals = []

    negative_signals = parse_json_field(human.get('negative_signals', '[]'))
    if isinstance(negative_signals, str):
        negative_signals = []

    reasons = parse_json_field(human.get('reasons', '[]'))
    if isinstance(reasons, str):
        reasons = []

    contact = parse_json_field(human.get('contact', '{}'))
    extra = parse_json_field(human.get('extra', '{}'))

    # 'extra' is sometimes nested one level deeper
    if 'extra' in extra:
        extra = {**extra, **parse_json_field(extra['extra'])}

    # basic info
    name = human.get('name') or human.get('username', 'unknown')
    username = human.get('username', 'unknown')
    platform = human.get('platform', 'unknown')
    bio = human.get('bio', '')
    location = human.get('location') or extra.get('location', '')
    score = human.get('score', 0)
    user_type = human.get('user_type', 'none')
    lost_score = human.get('lost_potential_score', 0)

    # avatar - image if available, otherwise first letter
    avatar_html = name[0].upper() if name else '?'
    avatar_url = extra.get('avatar_url') or extra.get('profile_image')
    if avatar_url:
        avatar_html = f'<img src="{avatar_url}" alt="{name}">'

    # location html
    location_html = f'<div class="location">📍 {location}</div>' if location else ''

    # pronouns - use the stored value, else try to detect from bio
    pronouns = extra.get('pronouns', '')
    if not pronouns and bio:
        bio_lower = bio.lower()
        if 'she/her' in bio_lower:
            pronouns = 'she/her'
        elif 'he/him' in bio_lower:
            pronouns = 'he/him'
        elif 'they/them' in bio_lower:
            pronouns = 'they/them'
    pronouns_html = f'<span class="pronouns">{pronouns}</span>' if pronouns else ''

    # platforms/handles
    handles = extra.get('handles', {})
    platforms_html = []

    # main platform
    if platform == 'github':
        platforms_html.append(f'<div class="platform platform-main"><span class="platform-icon">💻</span><a href="https://github.com/{username}" target="_blank">github.com/{username}</a></div>')
    elif platform == 'reddit':
        platforms_html.append(f'<div class="platform platform-main"><span class="platform-icon">🔴</span><a href="https://reddit.com/u/{username}" target="_blank">u/{username}</a></div>')
    elif platform == 'mastodon':
        instance = human.get('instance', 'mastodon.social')
        platforms_html.append(f'<div class="platform platform-main"><span class="platform-icon">🐘</span><a href="https://{instance}/@{username}" target="_blank">@{username}@{instance}</a></div>')
    elif platform == 'lobsters':
        platforms_html.append(f'<div class="platform platform-main"><span class="platform-icon">🦞</span><a href="https://lobste.rs/u/{username}" target="_blank">lobste.rs/u/{username}</a></div>')

    # other handles
    if handles.get('github') and platform != 'github':
        platforms_html.append(f'<div class="platform"><span class="platform-icon">💻</span><a href="https://github.com/{handles["github"]}" target="_blank">github.com/{handles["github"]}</a></div>')
    if handles.get('twitter'):
        t = handles['twitter'].lstrip('@')
        platforms_html.append(f'<div class="platform"><span class="platform-icon">🐦</span><a href="https://twitter.com/{t}" target="_blank">@{t}</a></div>')
    if handles.get('mastodon') and platform != 'mastodon':
        platforms_html.append(f'<div class="platform"><span class="platform-icon">🐘</span>{handles["mastodon"]}</div>')
    if handles.get('bluesky'):
        platforms_html.append(f'<div class="platform"><span class="platform-icon">🦋</span>{handles["bluesky"]}</div>')
    if handles.get('linkedin'):
        platforms_html.append(f'<div class="platform"><span class="platform-icon">💼</span><a href="https://linkedin.com/in/{handles["linkedin"]}" target="_blank">linkedin</a></div>')
    if handles.get('matrix'):
        platforms_html.append(f'<div class="platform"><span class="platform-icon">💬</span>{handles["matrix"]}</div>')

    # contact methods
    if contact.get('blog'):
        platforms_html.append(f'<div class="platform"><span class="platform-icon">🌐</span><a href="{contact["blog"]}" target="_blank">{contact["blog"]}</a></div>')

    # signals html
    signals_html = []
    for sig in signals:
        cls = 'tag rare' if sig in RARE_SIGNALS else 'tag'
        signals_html.append(f'<span class="{cls}">{sig}</span>')

    # negative signals
    negative_signals_html = ''
    if negative_signals:
        neg_tags = ' '.join(f'<span class="tag negative">{s}</span>' for s in negative_signals)
        negative_signals_html = f'<div style="margin-top: 16px;"><small style="color: #666;">negative signals:</small><br><div class="tags" style="margin-top: 8px;">{neg_tags}</div></div>'

    # reasons html
    reasons_html = '\n'.join(f'<div class="reason">{r}</div>' for r in reasons) if reasons else '<div class="reason">no specific reasons recorded</div>'

    # repos section
    repos_section = ''
    top_repos = extra.get('top_repos', [])
    languages = extra.get('languages', {})
    repo_count = extra.get('repo_count', 0)
    total_stars = extra.get('total_stars', 0)

    if top_repos or languages:
        repos_html = ''
        if top_repos:
            for repo in top_repos[:6]:
                repo_name = repo.get('name', 'unknown')
                # description may be None, so coalesce before slicing
                repo_desc = (repo.get('description') or '')[:200] or 'no description'
                repo_stars = repo.get('stars', 0)
                repo_lang = repo.get('language', '')
                lang_badge = f'<span class="repo-lang">{repo_lang}</span>' if repo_lang else ''

                repos_html += f'''
                <div class="repo">
                    <div class="repo-header">
                        <span class="repo-name"><a href="https://github.com/{username}/{repo_name}" target="_blank">{repo_name}</a></span>
                        <div class="repo-stats">
                            <span class="repo-stat"><span class="star">★</span> {repo_stars:,}</span>
                            {lang_badge}
                        </div>
                    </div>
                    <div class="repo-desc">{repo_desc}</div>
                </div>
                '''

        # languages
        langs_html = ''
        if languages:
            sorted_langs = sorted(languages.items(), key=lambda x: x[1], reverse=True)[:10]
            for lang, count in sorted_langs:
                langs_html += f'<div class="lang"><span class="lang-name">{lang}</span><span class="lang-count">×{count}</span></div>'

        repos_section = f'''
        <div class="section">
            <div class="section-header" onclick="toggleSection(this)">
                <span>🔨 what they build ({repo_count} repos, {total_stars:,} ★)</span>
                <span class="toggle">▼</span>
            </div>
            <div class="section-content">
                <div class="languages" style="margin-bottom: 16px;">
                    {langs_html}
                </div>
                <div class="repos">
                    {repos_html}
                </div>
            </div>
        </div>
        '''

    # communities section (subreddits, topics)
    communities_section = ''
    subreddits = extra.get('subreddits', [])
    topics = extra.get('topics', [])

    if subreddits or topics:
        subs_html = ''
        if subreddits:
            subs_html = '<div style="margin-bottom: 16px;"><small style="color: #666;">subreddits:</small><div class="subreddits" style="margin-top: 8px;">'
            for sub in subreddits:
                subs_html += f'<span class="subreddit"><a href="https://reddit.com/r/{sub}" target="_blank">r/{sub}</a></span>'
            subs_html += '</div></div>'

        topics_html = ''
        if topics:
            topics_html = '<div><small style="color: #666;">topics:</small><div class="tags" style="margin-top: 8px;">'
            for topic in topics:
                topics_html += f'<span class="tag">{topic}</span>'
            topics_html += '</div></div>'

        communities_section = f'''
        <div class="section">
            <div class="section-header" onclick="toggleSection(this)">
                <span>👥 communities</span>
                <span class="toggle">▼</span>
            </div>
            <div class="section-content">
                {subs_html}
                {topics_html}
            </div>
        </div>
        '''

    # contact section
    contact_html = '<div class="contact-methods">'
    emails = contact.get('emails', [])
    if contact.get('email') and contact['email'] not in emails:
        emails = [contact['email']] + emails

    if emails:
        for i, email in enumerate(emails[:3]):
            preferred = 'preferred' if i == 0 else ''
            contact_html += f'<div class="contact-method {preferred}"><span>📧</span><a href="mailto:{email}">{email}</a></div>'

    if contact.get('mastodon'):
        contact_html += f'<div class="contact-method"><span>🐘</span>{contact["mastodon"]}</div>'
    if contact.get('matrix'):
        contact_html += f'<div class="contact-method"><span>💬</span>{contact["matrix"]}</div>'
    if contact.get('twitter'):
        contact_html += f'<div class="contact-method"><span>🐦</span>@{contact["twitter"]}</div>'

    if not emails and not contact.get('mastodon') and not contact.get('matrix'):
        contact_html += '<div class="contact-method">no contact methods discovered</div>'

    contact_html += '</div>'

    # raw json
    raw_json = json.dumps(human, indent=2, default=str)

    # render
    return PROFILE_HTML.format(
        name=name,
        username=username,
        platform=platform,
        bio=bio,
        score=int(score),
        user_type=user_type,
        user_type_class=user_type,
        avatar=avatar_html,
        location_html=location_html,
        pronouns_html=pronouns_html,
        platforms_html='\n'.join(platforms_html),
        signals_html='\n'.join(signals_html),
        signal_count=len(signals),
        negative_signals_html=negative_signals_html,
        reasons_html=reasons_html,
        repos_section=repos_section,
        communities_section=communities_section,
        match_count=match_count,
        lost_score=int(lost_score),
        contact_html=contact_html,
        raw_json=raw_json,
        id=human.get('id', 0),
    )

@@ -1,2 +1,3 @@
 requests>=2.28.0
 beautifulsoup4>=4.12.0
+groq>=0.4.0

scoutd/forges.py (new file, 491 lines)
@@ -0,0 +1,491 @@
"""
|
||||
scoutd/forges.py - scrape self-hosted git forges
|
||||
|
||||
these people = highest signal. they actually selfhost.
|
||||
|
||||
supported platforms:
|
||||
- gitea (and forks like forgejo)
|
||||
- gogs
|
||||
- gitlab ce
|
||||
- sourcehut
|
||||
- codeberg (gitea-based)
|
||||
|
||||
scrapes users AND extracts contact info for outreach.
|
||||
"""
|
||||
|
||||
import os
|
||||
import re
|
||||
import json
|
||||
import time
|
||||
import requests
|
||||
from typing import List, Dict, Optional, Tuple
|
||||
from datetime import datetime
|
||||
|
||||
from .signals import analyze_text
|
||||
|
||||
# rate limiting
|
||||
REQUEST_DELAY = 1.0
|
||||
|
||||
# known public instances to scrape
|
||||
# format: (name, url, platform_type)
|
||||
KNOWN_INSTANCES = [
|
||||
# === PUBLIC INSTANCES ===
|
||||
# local/private instances can be added via LOCAL_FORGE_INSTANCES env var
|
||||
# codeberg (largest gitea instance)
|
||||
('codeberg', 'https://codeberg.org', 'gitea'),
|
||||
|
||||
# sourcehut
|
||||
('sourcehut', 'https://sr.ht', 'sourcehut'),
|
||||
|
||||
# notable gitea/forgejo instances
|
||||
('gitea.com', 'https://gitea.com', 'gitea'),
|
||||
('git.disroot.org', 'https://git.disroot.org', 'gitea'),
|
||||
('git.gay', 'https://git.gay', 'forgejo'),
|
||||
('git.envs.net', 'https://git.envs.net', 'forgejo'),
|
||||
('tildegit', 'https://tildegit.org', 'gitea'),
|
||||
('git.sr.ht', 'https://git.sr.ht', 'sourcehut'),
|
||||
|
||||
# gitlab ce instances
|
||||
('framagit', 'https://framagit.org', 'gitlab'),
|
||||
('gitlab.gnome.org', 'https://gitlab.gnome.org', 'gitlab'),
|
||||
('invent.kde.org', 'https://invent.kde.org', 'gitlab'),
|
||||
('salsa.debian.org', 'https://salsa.debian.org', 'gitlab'),
|
||||
]
|
||||
|
||||
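The comment above mentions a `LOCAL_FORGE_INSTANCES` env var for private instances but the module never parses it. A minimal sketch of how such entries could be turned into the same `(name, url, platform_type)` tuples; the `name,url,type;name2,url2,type2` format here is an assumption, not documented behavior:

```python
import os

def load_local_instances(raw=None):
    """parse LOCAL_FORGE_INSTANCES into (name, url, platform_type) tuples.
    assumed format: 'myforge,https://git.example.com,forgejo;work,https://git.work.lan,gitea'"""
    if raw is None:
        raw = os.getenv('LOCAL_FORGE_INSTANCES', '')
    instances = []
    for entry in raw.split(';'):
        parts = [p.strip() for p in entry.split(',')]
        # keep only well-formed entries with an http(s) url
        if len(parts) == 3 and parts[1].startswith('http'):
            instances.append((parts[0], parts[1], parts[2]))
    return instances
```

These could then simply be appended to KNOWN_INSTANCES at startup.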
# headers
HEADERS = {
    'User-Agent': 'connectd/1.0 (finding builders with aligned values)',
    'Accept': 'application/json',
}


def log(msg):
    print(f"  forges: {msg}")


# === GITEA/FORGEJO/GOGS API ===
# these share the same API structure

def scrape_gitea_users(instance_url: str, limit: int = 100) -> List[Dict]:
    """
    scrape users from a gitea/forgejo/gogs instance.
    uses the API if available, falling back to the explore/users page.
    """
    users = []

    # try API first (gitea 1.x+)
    try:
        api_url = f"{instance_url}/api/v1/users/search"
        params = {'q': '', 'limit': min(limit, 50)}
        resp = requests.get(api_url, params=params, headers=HEADERS, timeout=15)

        if resp.status_code == 200:
            data = resp.json()
            user_list = data.get('data', []) or data.get('users', []) or data
            if isinstance(user_list, list):
                for u in user_list[:limit]:
                    users.append({
                        'username': u.get('login') or u.get('username'),
                        'full_name': u.get('full_name'),
                        'avatar': u.get('avatar_url'),
                        'website': u.get('website'),
                        'location': u.get('location'),
                        'bio': u.get('description') or u.get('bio'),
                    })
            log(f"  got {len(users)} users via API")
    except Exception as e:
        log(f"  API failed: {e}")

    # fallback: scrape the explore page
    if not users:
        try:
            explore_url = f"{instance_url}/explore/users"
            resp = requests.get(explore_url, headers=HEADERS, timeout=15)
            if resp.status_code == 200:
                # parse HTML for usernames
                usernames = re.findall(r'href="/([^/"]+)"[^>]*class="[^"]*user[^"]*"', resp.text)
                usernames += re.findall(r'<a[^>]+href="/([^/"]+)"[^>]*title="[^"]*"', resp.text)
                usernames = list(set(usernames))[:limit]
                for username in usernames:
                    if username and not username.startswith(('explore', 'api', 'user', 'repo')):
                        users.append({'username': username})
                log(f"  got {len(users)} users via scrape")
        except Exception as e:
            log(f"  scrape failed: {e}")

    return users


def get_gitea_user_details(instance_url: str, username: str) -> Optional[Dict]:
    """get detailed user info from gitea/forgejo/gogs"""
    try:
        # API endpoint
        api_url = f"{instance_url}/api/v1/users/{username}"
        resp = requests.get(api_url, headers=HEADERS, timeout=10)

        if resp.status_code == 200:
            u = resp.json()
            return {
                'username': u.get('login') or u.get('username'),
                'full_name': u.get('full_name'),
                'email': u.get('email'),  # may be hidden
                'website': u.get('website'),
                'location': u.get('location'),
                'bio': u.get('description') or u.get('bio'),
                'created': u.get('created'),
                'followers': u.get('followers_count', 0),
                'following': u.get('following_count', 0),
            }
    except Exception:
        pass
    return None


def get_gitea_user_repos(instance_url: str, username: str, limit: int = 10) -> List[Dict]:
    """get a user's repos from gitea/forgejo/gogs"""
    repos = []
    try:
        api_url = f"{instance_url}/api/v1/users/{username}/repos"
        resp = requests.get(api_url, headers=HEADERS, timeout=10)

        if resp.status_code == 200:
            for r in resp.json()[:limit]:
                repos.append({
                    'name': r.get('name'),
                    'full_name': r.get('full_name'),
                    'description': r.get('description'),
                    'stars': r.get('stars_count', 0),
                    'forks': r.get('forks_count', 0),
                    'language': r.get('language'),
                    'updated': r.get('updated_at'),
                })
    except Exception:
        pass
    return repos


# === GITLAB CE API ===

def scrape_gitlab_users(instance_url: str, limit: int = 100) -> List[Dict]:
    """scrape users from a gitlab ce instance"""
    users = []

    try:
        # gitlab API - public users endpoint
        api_url = f"{instance_url}/api/v4/users"
        params = {'per_page': min(limit, 100), 'active': True}
        resp = requests.get(api_url, params=params, headers=HEADERS, timeout=15)

        if resp.status_code == 200:
            for u in resp.json()[:limit]:
                users.append({
                    'username': u.get('username'),
                    'full_name': u.get('name'),
                    'avatar': u.get('avatar_url'),
                    'website': u.get('website_url'),
                    'location': u.get('location'),
                    'bio': u.get('bio'),
                    'public_email': u.get('public_email'),
                })
            log(f"  got {len(users)} gitlab users")
    except Exception as e:
        log(f"  gitlab API failed: {e}")

    return users


def get_gitlab_user_details(instance_url: str, username: str) -> Optional[Dict]:
    """get detailed gitlab user info"""
    try:
        api_url = f"{instance_url}/api/v4/users"
        params = {'username': username}
        resp = requests.get(api_url, params=params, headers=HEADERS, timeout=10)

        if resp.status_code == 200:
            users = resp.json()
            if users:
                u = users[0]
                return {
                    'username': u.get('username'),
                    'full_name': u.get('name'),
                    'email': u.get('public_email'),
                    'website': u.get('website_url'),
                    'location': u.get('location'),
                    'bio': u.get('bio'),
                    'created': u.get('created_at'),
                }
    except Exception:
        pass
    return None


def get_gitlab_user_projects(instance_url: str, username: str, limit: int = 10) -> List[Dict]:
    """get a user's projects from gitlab"""
    repos = []
    try:
        # first get the user id
        api_url = f"{instance_url}/api/v4/users"
        params = {'username': username}
        resp = requests.get(api_url, params=params, headers=HEADERS, timeout=10)

        if resp.status_code == 200 and resp.json():
            user_id = resp.json()[0].get('id')

            # then get their projects
            proj_url = f"{instance_url}/api/v4/users/{user_id}/projects"
            resp = requests.get(proj_url, headers=HEADERS, timeout=10)

            if resp.status_code == 200:
                for p in resp.json()[:limit]:
                    repos.append({
                        'name': p.get('name'),
                        'full_name': p.get('path_with_namespace'),
                        'description': p.get('description'),
                        'stars': p.get('star_count', 0),
                        'forks': p.get('forks_count', 0),
                        'updated': p.get('last_activity_at'),
                    })
    except Exception:
        pass
    return repos


# === SOURCEHUT API ===

def scrape_sourcehut_users(limit: int = 100) -> List[Dict]:
    """
    scrape users from sourcehut.
    sourcehut doesn't have a public user list, so we scrape from:
    - recent commits
    - mailing lists
    - project pages
    """
    users = []
    seen = set()

    try:
        # scrape from git.sr.ht explore
        resp = requests.get('https://git.sr.ht/projects', headers=HEADERS, timeout=15)
        if resp.status_code == 200:
            # extract usernames from repo paths like ~username/repo
            usernames = re.findall(r'href="/~([^/"]+)', resp.text)
            for username in usernames:
                if username not in seen:
                    seen.add(username)
                    users.append({'username': username})
                    if len(users) >= limit:
                        break
            log(f"  got {len(users)} sourcehut users")
    except Exception as e:
        log(f"  sourcehut scrape failed: {e}")

    return users


def get_sourcehut_user_details(username: str) -> Optional[Dict]:
    """get sourcehut user details"""
    try:
        # scrape the profile page
        profile_url = f"https://sr.ht/~{username}"
        resp = requests.get(profile_url, headers=HEADERS, timeout=10)

        if resp.status_code == 200:
            bio = ''
            # extract bio from the page
            bio_match = re.search(r'<div class="container">\s*<p>([^<]+)</p>', resp.text)
            if bio_match:
                bio = bio_match.group(1).strip()

            return {
                'username': username,
                'bio': bio,
                'profile_url': profile_url,
            }
    except Exception:
        pass
    return None


def get_sourcehut_user_repos(username: str, limit: int = 10) -> List[Dict]:
    """get a sourcehut user's repos"""
    repos = []
    try:
        git_url = f"https://git.sr.ht/~{username}"
        resp = requests.get(git_url, headers=HEADERS, timeout=10)

        if resp.status_code == 200:
            # extract repo names
            repo_matches = re.findall(rf'href="/~{username}/([^"]+)"', resp.text)
            for repo in repo_matches[:limit]:
                if repo and not repo.startswith(('refs', 'log', 'tree')):
                    repos.append({
                        'name': repo,
                        'full_name': f"~{username}/{repo}",
                    })
    except Exception:
        pass
    return repos


# === UNIFIED SCRAPER ===

def scrape_forge(instance_name: str, instance_url: str, platform_type: str, limit: int = 50) -> List[Dict]:
    """
    scrape users from any forge type.
    returns a list of human dicts ready for the database.
    """
    log(f"scraping {instance_name} ({platform_type})...")

    humans = []

    # get the user list based on platform type
    if platform_type in ('gitea', 'forgejo', 'gogs'):
        users = scrape_gitea_users(instance_url, limit)
        get_details = lambda u: get_gitea_user_details(instance_url, u)
        get_repos = lambda u: get_gitea_user_repos(instance_url, u)
    elif platform_type == 'gitlab':
        users = scrape_gitlab_users(instance_url, limit)
        get_details = lambda u: get_gitlab_user_details(instance_url, u)
        get_repos = lambda u: get_gitlab_user_projects(instance_url, u)
    elif platform_type == 'sourcehut':
        users = scrape_sourcehut_users(limit)
        get_details = get_sourcehut_user_details
        get_repos = get_sourcehut_user_repos
    else:
        log(f"  unknown platform type: {platform_type}")
        return []

    for user in users:
        username = user.get('username')
        if not username:
            continue

        time.sleep(REQUEST_DELAY)

        # get detailed info
        details = get_details(username)
        if details:
            user.update(details)

        # get repos
        repos = get_repos(username)

        # build the human record
        bio = user.get('bio', '') or ''
        website = user.get('website', '') or ''

        # analyze signals from bio + website
        score, signals, reasons = analyze_text(bio + ' ' + website)

        # BOOST: self-hosted git = highest signal
        score += 25
        signals.append('selfhosted_git')
        reasons.append(f'uses self-hosted git ({instance_name})')

        # extract contact info
        contact = {}
        email = user.get('email') or user.get('public_email')
        if email and '@' in email:
            contact['email'] = email
        if website:
            contact['website'] = website

        # build the human dict
        human = {
            'platform': f'{platform_type}:{instance_name}',
            'username': username,
            'name': user.get('full_name'),
            'bio': bio,
            'url': f"{instance_url}/{username}" if platform_type != 'sourcehut' else f"https://sr.ht/~{username}",
            'score': score,
            'signals': json.dumps(signals),
            'reasons': json.dumps(reasons),
            'contact': json.dumps(contact),
            'extra': json.dumps({
                'instance': instance_name,
                'instance_url': instance_url,
                'platform_type': platform_type,
                'repos': repos[:5],
                'followers': user.get('followers', 0),
                'email': email,
                'website': website,
            }),
            'user_type': 'builder' if repos else 'none',
        }

        humans.append(human)
        log(f"  {username}: score={score}, repos={len(repos)}")

    return humans


def scrape_all_forges(limit_per_instance: int = 30) -> List[Dict]:
    """scrape all known forge instances"""
    all_humans = []

    for instance_name, instance_url, platform_type in KNOWN_INSTANCES:
        try:
            humans = scrape_forge(instance_name, instance_url, platform_type, limit_per_instance)
            all_humans.extend(humans)
            log(f"  {instance_name}: {len(humans)} humans")
        except Exception as e:
            log(f"  {instance_name} failed: {e}")

        time.sleep(2)  # be nice between instances

    log(f"total: {len(all_humans)} humans from {len(KNOWN_INSTANCES)} forges")
    return all_humans


# === OUTREACH METHODS ===

def can_message_on_forge(instance_url: str, platform_type: str) -> bool:
    """check whether we can send messages on this forge"""
    # gitea/forgejo don't have DMs
    # gitlab has merge request comments
    # sourcehut has mailing lists
    return platform_type in ('gitlab', 'sourcehut')


def open_forge_issue(instance_url: str, platform_type: str,
                     owner: str, repo: str, title: str, body: str) -> Tuple[bool, str]:
    """
    open an issue on a forge as an outreach method.
    requires an API token for authenticated requests.
    """
    # would need tokens per instance - for now return False
    # this is a fallback method; email is preferred
    return False, "forge issue creation not implemented yet"

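For gitea/forgejo instances, which expose `POST /api/v1/repos/{owner}/{repo}/issues`, a sketch of what the missing implementation could look like once a per-instance personal access token is available (untested against a live instance; the token-sourcing is left out, per the comment above):

```python
import requests

def open_gitea_issue(instance_url, token, owner, repo, title, body):
    """sketch: open an issue via the gitea/forgejo API v1.
    `token` is assumed to be a personal access token for that instance."""
    resp = requests.post(
        f"{instance_url}/api/v1/repos/{owner}/{repo}/issues",
        headers={'Authorization': f'token {token}'},
        json={'title': title, 'body': body},
        timeout=15,
    )
    # gitea returns 201 Created with the issue object on success
    if resp.status_code == 201:
        return True, resp.json().get('html_url', '')
    return False, f"HTTP {resp.status_code}"
```

The same (ok, detail) tuple shape as open_forge_issue keeps it drop-in compatible as a per-platform backend.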
# === DISCOVERY ===

def discover_forge_instances() -> List[Tuple[str, str, str]]:
    """
    discover new forge instances from:
    - the fediverse (instances often announce themselves)
    - known lists
    - DNS patterns

    returns a list of (name, url, platform_type)
    """
    # start with the known instances
    instances = list(KNOWN_INSTANCES)

    # could add discovery logic here:
    # - scrape https://codeberg.org/forgejo/forgejo/issues for instance mentions
    # - check the fediverse for git.* domains
    # - crawl gitea/forgejo awesome lists

    return instances


if __name__ == '__main__':
    # smoke test
    print("testing forge scrapers...")

    # test codeberg
    humans = scrape_forge('codeberg', 'https://codeberg.org', 'gitea', limit=5)
    print(f"codeberg: {len(humans)} humans")
    for h in humans[:2]:
        print(f"  {h['username']}: {h['score']} - {h.get('signals')}")

@@ -246,9 +246,11 @@ def analyze_github_user(login):
            'repo_count': len(repos),
            'total_stars': total_stars,
            'hireable': user.get('hireable', False),
            'top_repos': [{'name': r.get('name'), 'description': r.get('description'), 'stars': r.get('stargazers_count', 0), 'language': r.get('language')} for r in repos[:5] if not r.get('fork')],
            'handles': handles,  # all discovered handles
        },
        'hireable': user.get('hireable', False),
        'top_repos': [{'name': r.get('name'), 'description': r.get('description'), 'stars': r.get('stargazers_count', 0), 'language': r.get('language')} for r in repos[:5] if not r.get('fork')],
        'scraped_at': datetime.now().isoformat(),
        # lost builder fields
        'lost_potential_score': lost_potential_score,

@@ -103,6 +103,15 @@ PLATFORM_PATTERNS = {
    'devto': [
        (r'https?://dev\.to/([^/?#]+)', lambda m: m.group(1)),
    ],
    # reddit/lobsters
    'reddit': [
        (r'https?://(?:www\.)?reddit\.com/u(?:ser)?/([^/?#]+)', lambda m: f"u/{m.group(1)}"),
        (r'https?://(?:old|new)\.reddit\.com/u(?:ser)?/([^/?#]+)', lambda m: f"u/{m.group(1)}"),
    ],
    'lobsters': [
        (r'https?://lobste\.rs/u/([^/?#]+)', lambda m: m.group(1)),
    ],

    # funding
    'kofi': [

553
scoutd/reddit.py
553
scoutd/reddit.py
|
|
@ -1,24 +1,14 @@
|
|||
"""
|
||||
scoutd/reddit.py - reddit discovery (DISCOVERY ONLY, NOT OUTREACH)
|
||||
scoutd/reddit.py - reddit discovery with TAVILY web search
|
||||
|
||||
reddit is a SIGNAL SOURCE, not a contact channel.
|
||||
flow:
|
||||
1. scrape reddit for users active in target subs
|
||||
2. extract their reddit profile
|
||||
3. look for links TO other platforms (github, mastodon, website, etc.)
|
||||
4. add to scout database with reddit as signal source
|
||||
5. reach out via their OTHER platforms, never reddit
|
||||
|
||||
if reddit user has no external links:
|
||||
- add to manual_queue with note "reddit-only, needs manual review"
|
||||
|
||||
also detects lost builders - stuck in learnprogramming for years, imposter syndrome, etc.
|
||||
CRITICAL: always quote usernames in tavily searches to avoid fuzzy matching
|
||||
"""
|
||||
|
||||
import requests
|
||||
import json
|
||||
import time
|
||||
import re
|
||||
import os
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
from collections import defaultdict
|
||||
|
|
@ -35,43 +25,14 @@ from .lost import (
|
|||
HEADERS = {'User-Agent': 'connectd:v1.0 (community discovery)'}
|
||||
CACHE_DIR = Path(__file__).parent.parent / 'db' / 'cache' / 'reddit'
|
||||
|
||||
# patterns for extracting external platform links
|
||||
PLATFORM_PATTERNS = {
|
||||
'github': [
|
||||
r'github\.com/([a-zA-Z0-9_-]+)',
|
||||
r'gh:\s*@?([a-zA-Z0-9_-]+)',
|
||||
],
|
||||
'mastodon': [
|
||||
r'@([a-zA-Z0-9_]+)@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})',
|
||||
r'mastodon\.social/@([a-zA-Z0-9_]+)',
|
||||
r'fosstodon\.org/@([a-zA-Z0-9_]+)',
|
||||
r'hachyderm\.io/@([a-zA-Z0-9_]+)',
|
||||
r'tech\.lgbt/@([a-zA-Z0-9_]+)',
|
||||
],
|
||||
'twitter': [
|
||||
r'twitter\.com/([a-zA-Z0-9_]+)',
|
||||
r'x\.com/([a-zA-Z0-9_]+)',
|
||||
r'(?:^|\s)@([a-zA-Z0-9_]{1,15})(?:\s|$)', # bare @handle
|
||||
],
|
||||
'bluesky': [
|
||||
r'bsky\.app/profile/([a-zA-Z0-9_.-]+)',
|
||||
r'([a-zA-Z0-9_-]+)\.bsky\.social',
|
||||
],
|
||||
'website': [
|
||||
r'https?://([a-zA-Z0-9_-]+\.[a-zA-Z]{2,}[a-zA-Z0-9./_-]*)',
|
||||
],
|
||||
'matrix': [
|
||||
r'@([a-zA-Z0-9_-]+):([a-zA-Z0-9.-]+)',
|
||||
],
|
||||
}
|
||||
GITHUB_TOKEN = os.getenv('GITHUB_TOKEN')
|
||||
TAVILY_API_KEY = os.getenv('TAVILY_API_KEY', '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')
|
||||
|
||||
|
||||
def _api_get(url, params=None):
    """rate-limited request"""
def _api_get(url, params=None, headers=None):
    cache_key = f"{url}_{json.dumps(params or {}, sort_keys=True)}"
    cache_file = CACHE_DIR / f"{hash(cache_key) & 0xffffffff}.json"
    CACHE_DIR.mkdir(parents=True, exist_ok=True)

    if cache_file.exists():
        try:
            data = json.loads(cache_file.read_text())

@@ -79,142 +40,263 @@ def _api_get(url, params=None):
            return data.get('_data')
        except:
            pass

    time.sleep(2)  # reddit rate limit
    time.sleep(1)
    req_headers = {**HEADERS, **(headers or {})}
    try:
        resp = requests.get(url, headers=HEADERS, params=params, timeout=30)
        resp = requests.get(url, headers=req_headers, params=params, timeout=30)
        resp.raise_for_status()
        result = resp.json()
        cache_file.write_text(json.dumps({'_cached_at': time.time(), '_data': result}))
        return result
    except requests.exceptions.RequestException as e:
        print(f"  reddit api error: {e}")
    except:
        return None
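The new `_api_get` combines a file cache with a fixed sleep between requests. A standalone sketch of the same pattern follows; `CACHE_DIR` and `HEADERS` here are stand-ins for the module's own, and sha256 replaces the builtin `hash()`, which is salted per process and so would not give stable cache filenames across runs:

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path

import requests

# stand-ins for the module's own constants
CACHE_DIR = Path(tempfile.gettempdir()) / "connectd-cache"
HEADERS = {"User-Agent": "connectd/0.1"}


def cached_get(url, params=None, headers=None, min_interval=1.0):
    """fetch JSON with a simple file cache and a fixed sleep between misses."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    # sha256 of url+params gives a stable filename across runs
    key = hashlib.sha256(
        f"{url}_{json.dumps(params or {}, sort_keys=True)}".encode()
    ).hexdigest()
    cache_file = CACHE_DIR / f"{key}.json"
    if cache_file.exists():
        try:
            return json.loads(cache_file.read_text())["_data"]
        except (json.JSONDecodeError, KeyError):
            pass  # corrupt cache entry: fall through and refetch
    time.sleep(min_interval)  # crude throttle, only on cache misses
    try:
        resp = requests.get(url, headers={**HEADERS, **(headers or {})},
                            params=params, timeout=30)
        resp.raise_for_status()
        data = resp.json()
        cache_file.write_text(json.dumps({"_cached_at": time.time(), "_data": data}))
        return data
    except requests.RequestException:
        return None
```

Because the key is content-derived, pre-seeding the cache file short-circuits the network call entirely, which also makes the helper easy to test offline.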
def extract_external_links(text):
    """extract links to other platforms from text"""
    links = {}
def tavily_search(query, max_results=10):
    if not TAVILY_API_KEY:
        return []
    try:
        resp = requests.post(
            'https://api.tavily.com/search',
            json={'api_key': TAVILY_API_KEY, 'query': query, 'max_results': max_results},
            timeout=30
        )
        if resp.status_code == 200:
            return resp.json().get('results', [])
    except Exception as e:
        print(f"  tavily error: {e}")
    return []
def extract_links_from_text(text, username=None):
    found = {}
    if not text:
        return links
        return found
    text_lower = text.lower()
    username_lower = username.lower() if username else None

    for platform, patterns in PLATFORM_PATTERNS.items():
        for pattern in patterns:
            matches = re.findall(pattern, text, re.IGNORECASE)
            if matches:
                if platform == 'mastodon' and isinstance(matches[0], tuple):
                    # full fediverse handle
                    links[platform] = f"@{matches[0][0]}@{matches[0][1]}"
                elif platform == 'matrix' and isinstance(matches[0], tuple):
                    links[platform] = f"@{matches[0][0]}:{matches[0][1]}"
                elif platform == 'website':
                    # skip reddit/imgur/etc
                    for match in matches:
                        if not any(x in match.lower() for x in ['reddit', 'imgur', 'redd.it', 'i.redd']):
                            links[platform] = f"https://{match}"
    # email
    for email in re.findall(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}', text):
        if any(x in email.lower() for x in ['noreply', 'example', '@reddit', 'info@', 'support@', 'contact@', 'admin@']):
            continue
        if username_lower and username_lower in email.lower():
            found['email'] = email
            break
                else:
                    links[platform] = matches[0]
        if 'email' not in found:
            found['email'] = email

    # github
    for gh in re.findall(r'github\.com/([a-zA-Z0-9_-]+)', text):
        if gh.lower() in ['topics', 'explore', 'trending', 'sponsors', 'orgs']:
            continue
        if username_lower and gh.lower() == username_lower:
            found['github'] = gh
            break

    return links
    # mastodon
    masto = re.search(r'@([a-zA-Z0-9_]+)@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})', text)
    if masto:
        found['mastodon'] = f"@{masto.group(1)}@{masto.group(2)}"
    for inst in ['mastodon.social', 'fosstodon.org', 'hachyderm.io', 'tech.lgbt']:
        m = re.search(f'{inst}/@([a-zA-Z0-9_]+)', text)
        if m:
            found['mastodon'] = f"@{m.group(1)}@{inst}"
            break

    # bluesky
    bsky = re.search(r'bsky\.app/profile/([a-zA-Z0-9_.-]+)', text)
    if bsky:
        found['bluesky'] = bsky.group(1)

    # twitter
    tw = re.search(r'(?:twitter|x)\.com/([a-zA-Z0-9_]+)', text)
    if tw and tw.group(1).lower() not in ['home', 'explore', 'search']:
        found['twitter'] = tw.group(1)

    # linkedin
    li = re.search(r'linkedin\.com/in/([a-zA-Z0-9_-]+)', text)
    if li:
        found['linkedin'] = f"https://linkedin.com/in/{li.group(1)}"

    # twitch
    twitch = re.search(r'twitch\.tv/([a-zA-Z0-9_]+)', text)
    if twitch:
        found['twitch'] = f"https://twitch.tv/{twitch.group(1)}"

    # itch.io
    itch = re.search(r'itch\.io/profile/([a-zA-Z0-9_-]+)', text)
    if itch:
        found['itch'] = f"https://itch.io/profile/{itch.group(1)}"

    # website
    for url in re.findall(r'https?://([a-zA-Z0-9_-]+\.[a-zA-Z]{2,}[a-zA-Z0-9./_-]*)', text):
        skip = ['reddit', 'imgur', 'google', 'facebook', 'twitter', 'youtube', 'wikipedia', 'amazon']
        if not any(x in url.lower() for x in skip):
            if username_lower and username_lower in url.lower():
                found['website'] = f"https://{url}"
                break
            if 'website' not in found:
                found['website'] = f"https://{url}"

    return found
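The fediverse-handle pattern is the trickiest of these regexes, since it must match `@user@instance` without swallowing bare emails; a minimal check of it in isolation:

```python
import re


def find_fediverse_handle(text):
    """pull the first @user@instance handle out of free text
    (sketch of the pattern used above; the full scraper checks
    many platforms, this only demonstrates the mastodon one)."""
    m = re.search(r'@([a-zA-Z0-9_]+)@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})', text)
    return f"@{m.group(1)}@{m.group(2)}" if m else None


print(find_fediverse_handle("find me at @alice@fosstodon.org or on irc"))
# -> @alice@fosstodon.org
```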
def cross_platform_discovery(username, full_text=''):
    """
    search the ENTIRE internet using TAVILY.
    CRITICAL: always quote username to avoid fuzzy matching!
    """
    found = {}
    all_content = full_text
    username_lower = username.lower()

    print(f"    🔍 cross-platform search for {username}...")

    # ALWAYS QUOTE THE USERNAME - critical for exact matching
    searches = [
        f'"{username}"',                       # just username, quoted
        f'"{username}" github',                # github
        f'"{username}" developer programmer',  # dev context
        f'"{username}" email contact',         # contact
        f'"{username}" mastodon',              # fediverse
    ]

    for query in searches:
        print(f"      🌐 tavily: {query}")
        results = tavily_search(query, max_results=5)

        for result in results:
            url = result.get('url', '').lower()
            title = result.get('title', '')
            content = result.get('content', '')
            combined = f"{url} {title} {content}"

            # validate username appears
            if username_lower not in combined.lower():
                continue

            all_content += f" {combined}"

            # extract from URL directly
            if f'github.com/{username_lower}' in url and not found.get('github'):
                found['github'] = username
                print(f"      ✓ github: {username}")

            if f'twitch.tv/{username_lower}' in url and not found.get('twitch'):
                found['twitch'] = f"https://twitch.tv/{username}"
                print(f"      ✓ twitch")

            if 'itch.io/profile/' in url and username_lower in url and not found.get('itch'):
                found['itch'] = url if url.startswith('http') else f"https://{url}"
                print(f"      ✓ itch.io")

            if 'linkedin.com/in/' in url and not found.get('linkedin'):
                li = re.search(r'linkedin\.com/in/([a-zA-Z0-9_-]+)', url)
                if li:
                    found['linkedin'] = f"https://linkedin.com/in/{li.group(1)}"
                    print(f"      ✓ linkedin")

        # extract from content
        extracted = extract_links_from_text(all_content, username)
        for k, v in extracted.items():
            if k not in found:
                found[k] = v
                print(f"      ✓ {k}")

        # good contact found? stop searching
        if found.get('email') or found.get('github') or found.get('mastodon') or found.get('twitch'):
            break

    # === API CHECKS ===
    if not found.get('github'):
        headers = {'Authorization': f'token {GITHUB_TOKEN}'} if GITHUB_TOKEN else {}
        try:
            resp = requests.get(f'https://api.github.com/users/{username}', headers=headers, timeout=10)
            if resp.status_code == 200:
                data = resp.json()
                found['github'] = username
                print(f"      ✓ github API")
                if data.get('email') and 'email' not in found:
                    found['email'] = data['email']
                if data.get('blog') and 'website' not in found:
                    found['website'] = data['blog'] if data['blog'].startswith('http') else f"https://{data['blog']}"
        except:
            pass

    if not found.get('mastodon'):
        for inst in ['mastodon.social', 'fosstodon.org', 'hachyderm.io', 'tech.lgbt']:
            try:
                resp = requests.get(f'https://{inst}/api/v1/accounts/lookup', params={'acct': username}, timeout=5)
                if resp.status_code == 200:
                    found['mastodon'] = f"@{username}@{inst}"
                    print(f"      ✓ mastodon: {found['mastodon']}")
                    break
            except:
                continue

    if not found.get('bluesky'):
        try:
            resp = requests.get('https://public.api.bsky.app/xrpc/app.bsky.actor.getProfile',
                                params={'actor': f'{username}.bsky.social'}, timeout=10)
            if resp.status_code == 200:
                found['bluesky'] = resp.json().get('handle')
                print(f"      ✓ bluesky")
        except:
            pass

    return found
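The mastodon fallback above loops over a fixed instance list and takes the first 200 response from the public `accounts/lookup` endpoint. A sketch of that probe with the HTTP client injectable — the `session` parameter is an addition for testability, not part of the module:

```python
import requests

MASTO_INSTANCES = ['mastodon.social', 'fosstodon.org', 'hachyderm.io', 'tech.lgbt']


def probe_mastodon(username, instances=MASTO_INSTANCES, session=requests):
    """return the first @user@instance whose public lookup endpoint
    answers 200 for this username, or None if no instance matches."""
    for inst in instances:
        try:
            resp = session.get(f'https://{inst}/api/v1/accounts/lookup',
                               params={'acct': username}, timeout=5)
            if resp.status_code == 200:
                return f"@{username}@{inst}"
        except requests.RequestException:
            continue  # unreachable instance: try the next one
    return None
```

Injecting `session` keeps the probe unit-testable with a fake client while production code passes nothing and uses `requests` directly.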
def get_user_profile(username):
    """get user profile including bio/description"""
    url = f'https://www.reddit.com/user/{username}/about.json'
    data = _api_get(url)

    if not data or 'data' not in data:
        return None

    profile = data['data']
    return {
        'username': username,
        'name': profile.get('name'),
        'bio': profile.get('subreddit', {}).get('public_description', ''),
        'title': profile.get('subreddit', {}).get('title', ''),
        'icon': profile.get('icon_img'),
        'created_utc': profile.get('created_utc'),
        'total_karma': profile.get('total_karma', 0),
        'link_karma': profile.get('link_karma', 0),
        'comment_karma': profile.get('comment_karma', 0),
    }
def get_subreddit_users(subreddit, limit=100):
    """get recent posters/commenters from a subreddit"""
    users = set()

    # posts
    url = f'https://www.reddit.com/r/{subreddit}/new.json'
    for endpoint in ['new', 'comments']:
        url = f'https://www.reddit.com/r/{subreddit}/{endpoint}.json'
        data = _api_get(url, {'limit': limit})
        if data and 'data' in data:
            for post in data['data'].get('children', []):
                author = post['data'].get('author')
            for item in data['data'].get('children', []):
                author = item['data'].get('author')
                if author and author not in ['[deleted]', 'AutoModerator']:
                    users.add(author)

    # comments
    url = f'https://www.reddit.com/r/{subreddit}/comments.json'
    data = _api_get(url, {'limit': limit})
    if data and 'data' in data:
        for comment in data['data'].get('children', []):
            author = comment['data'].get('author')
            if author and author not in ['[deleted]', 'AutoModerator']:
                users.add(author)

    return users
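The author-collection step depends only on the shape of reddit's listing JSON (the payload returned by `/r/<sub>/new.json` and `/r/<sub>/comments.json`), so it can be factored into a pure helper and exercised without the network; a sketch:

```python
def authors_from_listing(listing, skip=('[deleted]', 'AutoModerator')):
    """collect usernames from a reddit listing payload, ignoring
    deleted accounts and bots listed in `skip`."""
    users = set()
    for child in listing.get('data', {}).get('children', []):
        author = child.get('data', {}).get('author')
        if author and author not in skip:
            users.add(author)
    return users
```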
def get_user_activity(username):
    """get user's posts and comments"""
    activity = []

    # posts
    url = f'https://www.reddit.com/user/{username}/submitted.json'
    for endpoint in ['submitted', 'comments']:
        url = f'https://www.reddit.com/user/{username}/{endpoint}.json'
        data = _api_get(url, {'limit': 100})
        if data and 'data' in data:
            for post in data['data'].get('children', []):
            for item in data['data'].get('children', []):
                activity.append({
                    'type': 'post',
                    'subreddit': post['data'].get('subreddit'),
                    'title': post['data'].get('title', ''),
                    'body': post['data'].get('selftext', ''),
                    'score': post['data'].get('score', 0),
                    'type': 'post' if endpoint == 'submitted' else 'comment',
                    'subreddit': item['data'].get('subreddit'),
                    'title': item['data'].get('title', ''),
                    'body': item['data'].get('selftext', '') or item['data'].get('body', ''),
                    'score': item['data'].get('score', 0),
                })

    # comments
    url = f'https://www.reddit.com/user/{username}/comments.json'
    data = _api_get(url, {'limit': 100})
    if data and 'data' in data:
        for comment in data['data'].get('children', []):
            activity.append({
                'type': 'comment',
                'subreddit': comment['data'].get('subreddit'),
                'body': comment['data'].get('body', ''),
                'score': comment['data'].get('score', 0),
            })

    return activity
def analyze_reddit_user(username):
    """
    analyze a reddit user for alignment and extract external platform links.

    reddit is DISCOVERY ONLY - we find users here but contact them elsewhere.
    """
    activity = get_user_activity(username)
    if not activity:
        return None

    # get profile for bio
    profile = get_user_profile(username)

    # count subreddit activity
    sub_activity = defaultdict(int)
    text_parts = []
    total_karma = 0

@@ -232,20 +314,16 @@ def analyze_reddit_user(username):
    full_text = ' '.join(text_parts)
    text_score, positive_signals, negative_signals = analyze_text(full_text)

    # EXTRACT EXTERNAL LINKS - this is the key part
    # check profile bio first
    external_links = {}
    if profile:
        bio_text = f"{profile.get('bio', '')} {profile.get('title', '')}"
        external_links.update(extract_external_links(bio_text))
        external_links.update(extract_links_from_text(f"{profile.get('bio', '')} {profile.get('title', '')}", username))
    external_links.update(extract_links_from_text(full_text, username))

    # also scan posts/comments for links (people often share their github etc)
    activity_links = extract_external_links(full_text)
    for platform, link in activity_links.items():
        if platform not in external_links:
            external_links[platform] = link
    # TAVILY search
    discovered = cross_platform_discovery(username, full_text)
    external_links.update(discovered)

    # subreddit scoring
    # scoring
    sub_score = 0
    aligned_subs = []
    for sub, count in sub_activity.items():

@@ -254,13 +332,11 @@ def analyze_reddit_user(username):
            sub_score += weight * min(count, 5)
            aligned_subs.append(sub)

    # multi-sub bonus
    if len(aligned_subs) >= 5:
        sub_score += 30
    elif len(aligned_subs) >= 3:
        sub_score += 15

    # negative sub penalty
    for sub in sub_activity:
        if sub.lower() in [n.lower() for n in NEGATIVE_SUBREDDITS]:
            sub_score -= 50

@@ -268,77 +344,33 @@ def analyze_reddit_user(username):
    total_score = text_score + sub_score

    # bonus if they have external links (we can actually contact them)
    if external_links.get('github'):
        total_score += 10
        positive_signals.append('has github')
        positive_signals.append('github')
    if external_links.get('mastodon'):
        total_score += 10
        positive_signals.append('has mastodon')
    if external_links.get('website'):
        positive_signals.append('mastodon')
    if external_links.get('email'):
        total_score += 15
        positive_signals.append('email')
    if external_links.get('twitch'):
        total_score += 5
        positive_signals.append('has website')
        positive_signals.append('twitch')

    # === LOST BUILDER DETECTION ===
    # reddit is HIGH SIGNAL for lost builders - stuck in learnprogramming,
    # imposter syndrome posts, "i wish i could" language, etc.
    # lost builder
    subreddits_list = list(sub_activity.keys())
    lost_signals, lost_weight = analyze_reddit_for_lost_signals(activity, subreddits_list)

    # also check full text for lost patterns (already done partially in analyze_reddit_for_lost_signals)
    text_lost_signals, text_lost_weight = analyze_text_for_lost_signals(full_text)
    text_lost_signals, _ = analyze_text_for_lost_signals(full_text)
    for sig in text_lost_signals:
        if sig not in lost_signals:
            lost_signals.append(sig)
    lost_weight += text_lost_weight

    lost_potential_score = lost_weight
    builder_activity = 20 if external_links.get('github') else 0
    user_type = classify_user(lost_weight, builder_activity, total_score)

    # classify: builder, lost, both, or none
    # for reddit, builder_score is based on having external links + high karma
    builder_activity = 0
    if external_links.get('github'):
        builder_activity += 20
    if total_karma > 1000:
        builder_activity += 15
    elif total_karma > 500:
        builder_activity += 10
    confidence = min(0.95, 0.3 + (0.2 if len(activity) > 20 else 0) + (0.2 if len(aligned_subs) >= 2 else 0) + (0.1 if external_links else 0))

    user_type = classify_user(lost_potential_score, builder_activity, total_score)

    # confidence
    confidence = 0.3
    if len(activity) > 20:
        confidence += 0.2
    if len(aligned_subs) >= 2:
        confidence += 0.2
    if len(text_parts) > 10:
        confidence += 0.2
    # higher confidence if we have contact methods
    if external_links:
        confidence += 0.1
    confidence = min(confidence, 0.95)

    reasons = []
    if aligned_subs:
        reasons.append(f"active in: {', '.join(aligned_subs[:5])}")
    if positive_signals:
        reasons.append(f"signals: {', '.join(positive_signals[:5])}")
    if negative_signals:
        reasons.append(f"WARNING: {', '.join(negative_signals)}")
    if external_links:
        reasons.append(f"external: {', '.join(external_links.keys())}")

    # add lost reasons if applicable
    if user_type == 'lost' or user_type == 'both':
        lost_descriptions = get_signal_descriptions(lost_signals)
        if lost_descriptions:
            reasons.append(f"LOST SIGNALS: {', '.join(lost_descriptions[:3])}")

    # determine if this is reddit-only (needs manual review)
    reddit_only = len(external_links) == 0
    if reddit_only:
        reasons.append("REDDIT-ONLY: needs manual review for outreach")
    reddit_only = not any([external_links.get(k) for k in ['github', 'mastodon', 'bluesky', 'email', 'matrix', 'linkedin', 'twitch', 'itch']])

    return {
        'platform': 'reddit',

@@ -351,153 +383,46 @@ def analyze_reddit_user(username):
        'subreddits': aligned_subs,
        'activity_count': len(activity),
        'karma': total_karma,
        'reasons': reasons,
        'reasons': [f"contact: {', '.join(external_links.keys())}"] if external_links else [],
        'scraped_at': datetime.now().isoformat(),
        # external platform links for outreach
        'external_links': external_links,
        'reddit_only': reddit_only,
        'extra': {
            'github': external_links.get('github'),
            'mastodon': external_links.get('mastodon'),
            'twitter': external_links.get('twitter'),
            'bluesky': external_links.get('bluesky'),
            'website': external_links.get('website'),
            'matrix': external_links.get('matrix'),
            'reddit_karma': total_karma,
            'reddit_activity': len(activity),
        },
        # lost builder fields
        'lost_potential_score': lost_potential_score,
        'extra': external_links,
        'lost_potential_score': lost_weight,
        'lost_signals': lost_signals,
        'user_type': user_type,
    }
def scrape_reddit(db, limit_per_sub=50):
    """
    full reddit scrape - DISCOVERY ONLY

    finds aligned users, extracts external links for outreach.
    reddit-only users go to manual queue.
    """
    print("scoutd/reddit: starting scrape (discovery only, not outreach)...")

    # find users in multiple aligned subs
    print("scoutd/reddit: scraping (TAVILY enabled)...")
    user_subs = defaultdict(set)

    # aligned subs - active builders
    priority_subs = ['intentionalcommunity', 'cohousing', 'selfhosted',
                     'homeassistant', 'solarpunk', 'cooperatives', 'privacy',
                     'localllama', 'homelab', 'degoogle', 'pihole', 'unraid']

    # lost builder subs - people who need encouragement
    # these folks might be stuck, but they have aligned interests
    lost_subs = ['learnprogramming', 'findapath', 'getdisciplined',
                 'careerguidance', 'cscareerquestions', 'decidingtobebetter']

    # scrape both - we want to find lost builders with aligned interests
    all_subs = priority_subs + lost_subs

    for sub in all_subs:
        print(f"  scraping r/{sub}...")
    for sub in ['intentionalcommunity', 'cohousing', 'selfhosted', 'homeassistant', 'solarpunk', 'cooperatives', 'privacy', 'localllama', 'homelab', 'learnprogramming']:
        users = get_subreddit_users(sub, limit=limit_per_sub)
        for user in users:
            user_subs[user].add(sub)
        print(f"    found {len(users)} users")

    # filter for multi-sub users
    multi_sub = {u: subs for u, subs in user_subs.items() if len(subs) >= 2}
    print(f"  {len(multi_sub)} users in 2+ aligned subs")
    print(f"  {len(multi_sub)} users in 2+ subs")

    # analyze
    results = []
    reddit_only_count = 0
    external_link_count = 0
    builders_found = 0
    lost_found = 0

    for username in multi_sub:
        try:
            result = analyze_reddit_user(username)
            if result and result['score'] > 0:
                results.append(result)
                db.save_human(result)

                user_type = result.get('user_type', 'none')

                # track lost builders - reddit is high signal for these
                if user_type == 'lost':
                    lost_found += 1
                    lost_score = result.get('lost_potential_score', 0)
                    if lost_score >= 40:
                        print(f"  💔 u/{username}: lost_score={lost_score}, values={result['score']} pts")
                    # lost builders also go to manual queue if reddit-only
                    if result.get('reddit_only'):
                        _add_to_manual_queue(result)

                elif user_type == 'builder':
                    builders_found += 1

                elif user_type == 'both':
                    builders_found += 1
                    lost_found += 1
                    print(f"  ⚡ u/{username}: recovering builder")

                # track external links
                if result.get('reddit_only'):
                    reddit_only_count += 1
                    # add high-value users to manual queue for review
                    if result['score'] >= 50 and user_type != 'lost':  # lost already added above
                        _add_to_manual_queue(result)
                        print(f"  📋 u/{username}: {result['score']} pts (reddit-only → manual queue)")
                else:
                    external_link_count += 1
                    if result['score'] >= 50 and user_type == 'builder':
                        links = list(result.get('external_links', {}).keys())
                        print(f"  ★ u/{username}: {result['score']} pts → {', '.join(links)}")

        except Exception as e:
            print(f"  error on {username}: {e}")
            print(f"  error: {username}: {e}")

    print(f"scoutd/reddit: found {len(results)} aligned humans")
    print(f"  - {builders_found} active builders")
    print(f"  - {lost_found} lost builders (need encouragement)")
    print(f"  - {external_link_count} with external links (reachable)")
    print(f"  - {reddit_only_count} reddit-only (manual queue)")
    print(f"scoutd/reddit: {len(results)} humans")
    return results
def _add_to_manual_queue(result):
    """add reddit-only user to manual queue for review"""
    from pathlib import Path
    import json

    queue_file = Path(__file__).parent.parent / 'data' / 'manual_queue.json'
    queue_file.parent.mkdir(parents=True, exist_ok=True)

    queue = []
    if queue_file.exists():
        try:
            queue = json.loads(queue_file.read_text())
        except:
            pass

    # check if already in queue
    existing = [q for q in queue if q.get('username') == result['username'] and q.get('platform') == 'reddit']
    if existing:
        return

    queue.append({
        'platform': 'reddit',
        'username': result['username'],
        'url': result['url'],
        'score': result['score'],
        'subreddits': result.get('subreddits', []),
        'signals': result.get('signals', []),
        'reasons': result.get('reasons', []),
        'note': 'reddit-only user - no external links found. DM manually if promising.',
        'queued_at': datetime.now().isoformat(),
        'status': 'pending',
    })

    queue = json.loads(queue_file.read_text()) if queue_file.exists() else []
    if not any(q.get('username') == result['username'] for q in queue):
        queue.append({'platform': 'reddit', 'username': result['username'], 'url': result['url'], 'score': result['score'], 'queued_at': datetime.now().isoformat()})
    queue_file.write_text(json.dumps(queue, indent=2))
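The queue logic above can be condensed into one helper that dedupes on (platform, username); this sketch assumes a single writer process, so it does no file locking:

```python
import json
from datetime import datetime
from pathlib import Path


def enqueue_manual(queue_file, entry):
    """append `entry` to a JSON-array queue file, skipping duplicates
    keyed by (platform, username). returns True if the entry was added."""
    queue_file.parent.mkdir(parents=True, exist_ok=True)
    try:
        queue = json.loads(queue_file.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        queue = []  # missing or corrupt file: start fresh
    if any(q.get('platform') == entry.get('platform')
           and q.get('username') == entry.get('username') for q in queue):
        return False  # already queued
    entry.setdefault('queued_at', datetime.now().isoformat())
    queue.append(entry)
    queue_file.write_text(json.dumps(queue, indent=2))
    return True
```

Returning a boolean lets the caller log "queued" vs "skipped" without re-reading the file.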

68  soul.txt  (new file)

@@ -0,0 +1,68 @@
this is the soul for the ai that writes for connectd. each intro sent should have this basic content, but should make it unique and generate a clean, human way to say so.

"im connectd (or the connectd daemon). we found you
online and think that *factor* aligns really well
with the values of *person*...

this is not a sales maneuver, this is not a trick,
this is simply someone who wants to end the isolation
for our people and is doing it the only way i know how.

there is no financial gain or benefit to be had from
my reaching out to you, no data collected that we
couldn't find online..

simply a daemon made to reach out to people and bring
them together.

it sucks being alone. it sucks feeling isolated.

you are NOT alone and if you felt as if you were,
that means you're less alone than you thought-

go forth, open the door, it's right in front of you now.
all you have to do is step inside.

you're home now.

there's a better way and we are going to build it together."


you can reach *person* at *preferred contact method*

- connectd daemon
hope it goes well!
CONNECTD_ICONS (lines 33-44):

CONNECTD_ICONS = '''<div style="display:flex;gap:16px;flex-wrap:wrap">
<a href="https://github.com/connectd-daemon" title="GitHub" style="color:#888"><svg width="20" height="20" viewBox="0 0 24 24" fill="currentColor"><path d="M12 .297c-6.63 0-12 5.373-12 12 0 5.303 3.438 9.8 8.205 11.385.6.113.82-.258.82-.577 0-.285-.01-1.04-.015-2.04-3.338.724-4.042-1.61-4.042-1.61C4.422 18.07 3.633 17.7 3.633 17.7c-1.087-.744.084-.729.084-.729 1.205.084 1.838 1.236 1.838 1.236 1.07 1.835 2.809 1.305 3.495.998.108-.776.417-1.305.76-1.605-2.665-.3-5.466-1.332-5.466-5.93 0-1.31.465-2.38 1.235-3.22-.135-.303-.54-1.523.105-3.176 0 0 1.005-.322 3.3 1.23.96-.267 1.98-.399 3-.405 1.02.006 2.04.138 3 .405 2.28-1.552 3.285-1.23 3.285-1.23.645 1.653.24 2.873.12 3.176.765.84 1.23 1.91 1.23 3.22 0 4.61-2.805 5.625-5.475 5.92.42.36.81 1.096.81 2.22 0 1.606-.015 2.896-.015 3.286 0 .315.21.69.825.57C20.565 22.092 24 17.592 24 12.297c0-6.627-5.373-12-12-12"/></svg></a>
<a href="https://mastodon.sudoxreboot.com/@connectd" title="Mastodon" style="color:#888">...</a>
<a href="https://bsky.app/profile/connectd.bsky.social" title="Bluesky" style="color:#888">...</a>
<a href="https://lemmy.sudoxreboot.com/c/connectd" title="Lemmy" style="color:#888">...</a>
<a href="https://discord.gg/connectd" title="Discord" style="color:#888">...</a>
<a href="https://matrix.to/#/@connectd:sudoxreboot.com" title="Matrix" style="color:#888">...</a>
<a href="https://reddit.com/r/connectd" title="Reddit" style="color:#888">...</a>
<a href="mailto:connectd@sudoxreboot.com" title="Email" style="color:#888">...</a>
</div>'''

SIGNATURE_HTML (lines 46-49):

SIGNATURE_HTML = f'''<div style="margin-top:24px;padding-top:16px;border-top:1px solid #333">
<div style="margin-bottom:12px"><a href="https://github.com/sudoxnym/connectd" style="color:#8b5cf6">github.com/sudoxnym/connectd</a> <span style="color:#666;font-size:12px">(main repo)</span></div>
{CONNECTD_ICONS}
</div>'''

SIGNATURE_PLAIN (lines 51-61):

SIGNATURE_PLAIN = """
---
github.com/sudoxnym/connectd (main repo)

github: github.com/connectd-daemon
mastodon: @connectd@mastodon.sudoxreboot.com
bluesky: connectd.bsky.social
lemmy: lemmy.sudoxreboot.com/c/connectd
discord: discord.gg/connectd
matrix: @connectd:sudoxreboot.com
reddit: reddit.com/r/connectd
email: connectd@sudoxreboot.com
"""