I’ve recently had the desire to build myself a daily digest, with the express intent of ensuring all data is processed locally. I’m going to use this blog post to detail the project.
Before we go over how this was achieved, here’s a lightly anonymized example of an actual digest that landed in my inbox:
Weather
Today's high is 48°F with a low of 23°F — a significant jump from yesterday's rainy 38°F high and 30°F low.
Dinner Tonight
Salmon with sweet potatoes and green beans — one sheet pan
Calendar (next 7 days)
- Wed Apr 22 at 2:30 PM | Pick up kids from school early
- Thu Apr 23 at 3:30 PM | Doctor's appointment
- Mon Apr 27 | Take leftovers out from chest freezer
- Mon Apr 27 at 8:00 PM | Game night
- Tue Apr 28 at 5:00 PM | Put trash & recycling bins out by street
Email Highlights
Job Search
- Action Required: Sign NDA via Dropbox Sign (requested by recruiter).
- Interview Confirmed: Interview for "Super Fantastic Software Engineer" on Thu Apr 22 at 1:00 PM (EDT).
Financial & Tech
- Automatic payment confirmed for gym membership ($49.99/month).
Tasks Due Soon
- Overdue: Feed sourdough starter (due Apr 20).
Tech Feed
Reddit r/localllama
- Claude Code removed from Claude Pro plan - better time than ever to switch to Local Models.
  - Article: Anthropic removed Claude Code from its Pro plan, prompting users to consider switching to local models.
  - Discussion: Community debates whether this signals a strategic pivot toward sustainability or financial instability ahead of an anticipated IPO.
- Llama.cpp's auto fit works much better than I expected
  - Article: Llama.cpp's auto fit feature significantly improves VRAM efficiency and token processing speed with quantization techniques.
  - Discussion: Users debate optimal quantization levels (e.g., Q8_0 vs Q4) and GPU-specific configurations.
- Personal Eval follow-up: Gemma4 26B MoE (Q8) vs Qwen3.5 27B Dense vs Gemma4 31B Dense Compared
  - Article: A comparison of Gemma4's MoE and dense models against Qwen3.5's dense model highlights performance differences in coding tasks.
  - Discussion: Debate over whether MoE architectures outperform dense models for local agents.
Hacker News
- Show HN: GoModel – an open-source AI gateway in Go | discussion
  - Article: GoModel is an open-source AI gateway built in Go, designed to unify interactions with various AI providers.
  - Discussion: Developers praise its potential for unifying AI APIs but debate challenges like handling provider-specific inconsistencies and staying updated with API changes.
- Anthropic says OpenClaw-style Claude CLI usage is allowed again | discussion
  - Article: Anthropic confirms OpenClaw-style Claude CLI usage is permitted again.
  - Discussion: Users express frustration over Anthropic's inconsistent messaging and potential hidden costs for custom prompts.
Made with ♥ by dungeon using openai:gemma-4-26b in 184s
Hardware
I’ll be running this system on a MacBook Pro with an M3 Pro chip and 36 GB of RAM, acting as an always-on server. I’m currently running many other containers and services on it, which limits the amount of free RAM available.
This hardware constraint becomes relevant when choosing a local LLM; we’ll return to it in the Aggregator section below.
Information
Next let’s discuss the information I’d like in my digest.
Weather
What’s the weather like today? Compare the weather today to yesterday’s. Is there any chance of precipitation? Any specific callouts like “it’s cold enough that it warrants running the oil furnace instead of the heat pumps”, etc…
Data source: A weather API. For this project I chose Open-Meteo, which is a free API.
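To give a flavor of what the fetch looks like, here’s a minimal sketch of an Open-Meteo request made with httpx. The coordinates are placeholders, and the exact daily variables I request may differ slightly:

import httpx

# Minimal Open-Meteo request: today's and yesterday's highs/lows plus
# precipitation probability. Latitude/longitude are placeholder values.
params = {
    "latitude": 40.0,
    "longitude": -75.0,
    "daily": "temperature_2m_max,temperature_2m_min,precipitation_probability_max",
    "temperature_unit": "fahrenheit",
    "timezone": "auto",
    "past_days": 1,      # include yesterday so the digest can compare
    "forecast_days": 1,  # plus today
}
resp = httpx.get("https://api.open-meteo.com/v1/forecast", params=params)
resp.raise_for_status()
daily = resp.json()["daily"]
# daily["temperature_2m_max"] -> [yesterday_high, today_high], and so on.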
Food
Remind myself what we planned for dinner tonight. See if tonight’s meal has any component still on the “grocery shopping list” that might have been forgotten and is still needed.
Data sources: Google Calendar, Google Keep.
Tasks On My TODO list
Are there any items on my TODO list that I need to get to today? Or might have forgotten?
Data sources: Google Tasks, Obsidian vault.
Email
Are there any important emails I might have missed? Any emails that I need to reply to?
Data source: Gmail
Tech News
A summary of the tech news that interests me. Lately this has centered on LLMs and tooling, especially those that can be run locally. This combats FOMO and the urge to scroll.
Data sources: Reddit, Hacker News.
Both are fetched without any authentication. Reddit exposes public RSS feeds at https://www.reddit.com/r/{subreddit}/top.rss, and Hacker News exposes its top stories via a public Firebase API at https://hacker-news.firebaseio.com/v0. Both are fetched using Python’s httpx.
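Roughly, the fetchers look like the sketch below; the subreddit, item limit, and error handling are simplified, and the real code parses the RSS XML into items:

import httpx

# Reddit: public RSS feed for a subreddit's top posts (no auth needed).
rss = httpx.get(
    "https://www.reddit.com/r/localllama/top.rss",
    params={"t": "day"},
    headers={"User-Agent": "daily-digest/0.1"},  # Reddit rejects default user agents
    follow_redirects=True,
)
rss.raise_for_status()

# Hacker News: public Firebase API, top story IDs first, then each item by ID.
hn = "https://hacker-news.firebaseio.com/v0"
top_ids = httpx.get(f"{hn}/topstories.json").json()[:10]
stories = [httpx.get(f"{hn}/item/{i}.json").json() for i in top_ids]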
Architecture
Scheduling Recurring Execution
Since we’re on macOS, we use launchd, the platform’s native alternative to the more familiar cron, to schedule our system to run. I’m currently scheduling it for 5:30 AM every day.
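For anyone unfamiliar with launchd, the schedule lives in a launch agent plist dropped into ~/Library/LaunchAgents. The label and paths below are placeholders rather than my actual configuration:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.daily-digest</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/bin/daily-digest</string>
    </array>
    <key>StartCalendarInterval</key>
    <dict>
        <key>Hour</key>
        <integer>5</integer>
        <key>Minute</key>
        <integer>30</integer>
    </dict>
    <key>StandardOutPath</key>
    <string>/tmp/daily-digest.log</string>
    <key>StandardErrorPath</key>
    <string>/tmp/daily-digest.err</string>
</dict>
</plist>

It gets loaded once with launchctl load ~/Library/LaunchAgents/com.example.daily-digest.plist, after which launchd fires it every morning.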
Aggregator
This is where the fun begins. We take all of the data we’ve collected and treat it, plus a prompt, as the context for an LLM to process. The LLM runs locally, for privacy reasons.
┌─────────────────────┐ ┌─────────────────────┐
│ Fetched Context │ │ Prompt │
│ (weather, tasks, │ │ (instructions for │
│ email, news, …) │ │ how to summarize) │
└──────────┬──────────┘ └──────────┬──────────┘
│ │
└────────────┬────────────┘
│
▼
┌────────────────────────┐
│ Local LLM │
└────────────┬───────────┘
│
▼
┌────────────────────────┐
│ Digest (markdown) │
└────────────────────────┘
Recall that I stated above we’ll be running this system on macOS. To run our LLM, I’ve been experimenting with LM Studio and Omlx, the latter of which is supposed to be faster on macOS hardware for models in the MLX format.
Currently, I’m using google/gemma-4-26b-a4b. While not perfect, it’s working well enough.
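The aggregation call itself is small. LM Studio exposes an OpenAI-compatible HTTP server locally (port 1234 by default), so the step boils down to roughly the following; the prompt, temperature, and timeout are illustrative rather than my actual values:

import httpx

# The fetched context is the concatenated weather, calendar, tasks, email,
# and news sections; a placeholder string stands in for it here.
fetched_context = "## Weather\n...\n## Calendar\n..."

payload = {
    "model": "google/gemma-4-26b-a4b",
    "messages": [
        {"role": "system", "content": "Write my morning digest in markdown."},
        {"role": "user", "content": fetched_context},
    ],
    "temperature": 0.3,
}
resp = httpx.post(
    "http://localhost:1234/v1/chat/completions",  # LM Studio's local server
    json=payload,
    timeout=600,  # local inference on constrained hardware can be slow
)
digest_markdown = resp.json()["choices"][0]["message"]["content"]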
Storage
Every digest run is recorded to a local SQLite database. This gives me an audit log and lets me look back at past digests. Each row captures the profile, model, hostname, how many items were fetched from each source, the full summary, and a UUID for the run.
CREATE TABLE digestrun (
id INTEGER NOT NULL PRIMARY KEY,
run_uuid VARCHAR NOT NULL,
profile VARCHAR NOT NULL,
status VARCHAR NOT NULL,
started_at DATETIME NOT NULL,
finished_at DATETIME,
duration_seconds FLOAT,
model VARCHAR NOT NULL,
hostname VARCHAR NOT NULL,
emails_fetched INTEGER NOT NULL,
events_fetched INTEGER NOT NULL,
tasks_fetched INTEGER NOT NULL,
feed_items_fetched INTEGER NOT NULL,
summary VARCHAR,
error VARCHAR
);
The run_uuid is a uuid4() generated at the start of each run — it’s also stamped in the footer of the email itself, so I can correlate a received digest back to its database record.
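The bookkeeping is a plain sqlite3 insert at the start of the run, roughly like the sketch below. The database path and profile name are placeholders, and it assumes the digestrun table above has already been created:

import socket
import sqlite3
import uuid
from datetime import datetime, timezone

run_uuid = str(uuid.uuid4())  # the same value gets stamped into the email footer

conn = sqlite3.connect("digest.db")  # placeholder path
conn.execute(
    """
    INSERT INTO digestrun
        (run_uuid, profile, status, started_at, model, hostname,
         emails_fetched, events_fetched, tasks_fetched, feed_items_fetched)
    VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
    """,
    (
        run_uuid,
        "me",                                   # placeholder profile name
        "running",
        datetime.now(timezone.utc).isoformat(),
        "google/gemma-4-26b-a4b",
        socket.gethostname(),
        0, 0, 0, 0,                             # counts updated as sources are fetched
    ),
)
conn.commit()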
Diagram
┌─────────────────────────┐
│ launchd │
└────────────┬────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Data Sources │
│ │
│ ┌─────────┐ ┌──────────┐ ┌───────┐ ┌─────┐ ┌───────┐ │
│ │ Weather │ │ Calendar │ │ Tasks │ │Email│ │ Reddit│ │
│ │ API │ │ /Keep │ │/Vault │ │ │ │ /HN │ │
│ └────┬────┘ └────┬─────┘ └───┬───┘ └───┬─┘ └────┬──┘ │
└───────┼────────────┼────────────┼──────────┼─────────┼──────┘
│ │ │ │ │
└────────────┴────────────┴──────────┴─────────┘
│
▼
┌─────────────────────────┐
│ Local LLM │
│ (LM Studio / Omlx) │
└────────────┬────────────┘
│
│ Convert the markdown to HTML
│
▼
┌────────────────────────┐
│ Gmail │
│ Daily Digest Email │
└────────────────────────┘
Digest Delivery
This is the simplest part. The digest output is sent as an HTML email to my inbox using Python’s built-in smtplib, delivered via Gmail’s SMTP server with STARTTLS. The LLM’s markdown output is converted to HTML before sending, with a plain-text fallback included. We essentially email ourselves.
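In outline, and with placeholder addresses and credentials, the delivery step looks something like this. I’m assuming the third-party markdown package here for the conversion; any markdown-to-HTML converter would do:

import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

import markdown  # third-party markdown-to-HTML converter (an assumption on my part)

digest_markdown = "# Daily Digest\n..."  # the aggregator's output

msg = MIMEMultipart("alternative")
msg["Subject"] = "Daily Digest"
msg["From"] = "me@example.com"
msg["To"] = "me@example.com"
msg.attach(MIMEText(digest_markdown, "plain"))                    # plain-text fallback
msg.attach(MIMEText(markdown.markdown(digest_markdown), "html"))  # HTML version

with smtplib.SMTP("smtp.gmail.com", 587) as smtp:
    smtp.starttls()
    smtp.login("me@example.com", "app-password")  # Gmail app password
    smtp.send_message(msg)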
Personalization
The system was made profile-driven, allowing me to add a second digest for my wife. She gets the same weather and calendar but with a different prompt, focused on shared events and upcoming commitments; her profile skips email, tasks, and tech news.
Her automation runs 5 minutes after mine to avoid concurrent Google API calls, and delivers to her inbox instead of mine.
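Conceptually a profile is just a small bundle of settings. The field names below are illustrative rather than the actual code:

from dataclasses import dataclass

@dataclass
class Profile:
    name: str
    recipient: str           # where the digest email is delivered
    prompt: str              # profile-specific summarization instructions
    include_email: bool = True
    include_tasks: bool = True
    include_tech_feed: bool = True

# My profile gets everything; hers keeps only weather and calendar.
wife = Profile(
    name="wife",
    recipient="her@example.com",
    prompt="Focus on shared events and upcoming commitments.",
    include_email=False,
    include_tasks=False,
    include_tech_feed=False,
)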
Next Steps
Overall, I’ve been running this system for about a week, and I have to be honest: I love it. I’m pretty excited to read the digest over a cup of coffee each morning, and it has already caught an email that slipped through and a task I forgot to do.
I’m excited to see whether more powerful LLMs become accessible at even smaller RAM footprints, which would improve this system.
Additionally, I might investigate breaking the single inference step into multiple steps, each with its own specialized prompt, to see if that produces a better final output. Since our local model is not the most powerful, giving it many smaller focused tasks may yield better results than one large one. This would increase overall runtime, but for offline inference that’s not really a concern.