[{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/Portfolio/tags/ai/","section":"Tags","summary":"","title":"Ai","type":"tags"},{"content":"Hands-on projects built alongside the course — local LLM chatbots, AI-powered assistants, and more.\n","date":"11 May 2026","externalUrl":null,"permalink":"/Portfolio/projects/","section":"AI Projects","summary":"","title":"AI Projects","type":"projects"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/Portfolio/tags/automation/","section":"Tags","summary":"","title":"Automation","type":"tags"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/Portfolio/categories/","section":"Categories","summary":"","title":"Categories","type":"categories"},{"content":" Welcome to My Portfolio # A collection of lessons and projects documenting my journey through AI-driven application development.\nLessons — notes from each course session Projects — hands-on builds from the course ","date":"11 May 2026","externalUrl":null,"permalink":"/Portfolio/","section":"Home","summary":"","title":"Home","type":"page"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/Portfolio/tags/java/","section":"Tags","summary":"","title":"Java","type":"tags"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/Portfolio/tags/javalin/","section":"Tags","summary":"","title":"Javalin","type":"tags"},{"content":" Baggrund # Hvert år i starten af december åbner dørene til Engestofte Gods i Maribo for et af årets hyggeligste arrangementer — et julemarked i de historiske stalde, lader og den gamle gårdsplads. I 2026 løber det af stablen den 5.-6. december, kl. 
10-16.\nBag kulisserne koordinerer én person — eventkoordinatoren — hele rekrutteringsprocessen: hun finder potentielle stadeholdere, sender dem en ansøgningsformular manuelt, vurderer ansøgningerne ud fra produktkvalitet, originalitet og tidligere deltagelse, sender individuelle kontrakter, modtager signaturer og udsteder fakturaer. Alt dette foregår via e-mail, regneark og papir — for op mod 80-90 stadeholdere.\nDet er præcis den slags manuelt, teksttungt arbejde, som AI er skabt til at hjælpe med.\nSpørgsmålet vi stillede os selv: Giver det mening at automatisere dette med AI?\nSvaret på det spørgsmål er selve udgangspunktet for dette projekt. Vi byggede en fullstack admin-applikation som en prototype for at teste, om en AI-assisteret arbejdsgang faktisk er brugbar i praksis — eller om det bare er teknologi for teknologiens skyld.\nTeknologivalg # Backend: Java 17 + Javalin + Hibernate + PostgreSQL # Vi valgte vores kendte stack fra undervisning og egne projekter.\nJavalin er et letvægts Java-framework, der giver os en simpel REST API uden al den boilerplate, som Spring Boot medfører. Til en intern admin-applikation med et begrænset antal endpoints er det et oplagt valg — hurtig opsætning, nem debugging.\nHibernate + PostgreSQL håndterer persistens. Data modellen er simpel: virksomheder, kontrakter, fakturaer og en log over sendte e-mails. Hibernate giver os ORM uden at skrive rå SQL for alle CRUD-operationer, og PostgreSQL er robust nok til produktion hvis vi skulle skalere.\nOllama med llama3.1:8b er vores lokale LLM-integration. Vi kørte modellen lokalt frem for at bruge en ekstern API (f.eks. OpenAI) af to grunde:\nIngen API-udgifter under udvikling og test Data forbliver lokalt — virksomhedsoplysninger og kontaktdata sendes ikke til tredjeparter Frontend: React 19 + Vite + Bootstrap # React er vores foretrukne frontend-framework. 
Vite provides fast HMR and a modern build setup without configuration nightmares.\nBootstrap is used for styling: not because it is the prettiest, but because it is fast to work with for internal admin interfaces, where UX matters more than visual originality.\nImplementation Strategy: The 6-Step Flow # The application models the entire lifecycle of a Christmas-market participant in six steps, represented as statuses on a company in the database.\nOPRETTET → KONTAKTET → KONTRAKT_SENDT → KONTRAKT_UNDERSKREVET → FAKTURA_SENDT → GODKENDT Step 1 — Outreach # The admin creates stallholders in the system with the relevant details: stall type (food, gifts, crafts, decoration), stall size, duration, and number of staff. Engestofte evaluates applicants on, among other things, product quality, originality, and whether they differ from existing stallholders; these parameters can be given to the AI as context.\nThe AI then generates a personalized outreach e-mail in Danish, tailored to the company's profile and the concept and atmosphere of Engestofte Gods. The admin can edit the text in an inline editor field before sending.\nStep 2 — Response Registration # When the stallholder replies, the admin registers the response manually. If they say yes, the flow continues. If they say no, they are removed from the list. The AI can suggest new candidates to maintain the target of 80-90 stallholders and ensure a good variety of stall types.\nStep 3 — Contract # The AI generates an individual contract based on the stallholder's specific parameters: placement in the stable, barn, or outdoors, stall size, duration, and number of staff. The contract includes the requirements Engestofte sets: neutral tent colors, proper anchoring to the ground, and fire-safety rules for decorations. The admin reviews and sends it. 
Once the contract is signed, the admin marks it in the system.\nSteps 4 & 5 — Invoice # The AI calculates the final price from a simple formula:\nPrice = (chairs × 500 kr/day) + (200 kr/day fixed) + (staff × 100 kr/day) The invoice is generated, presented to the admin for approval, and then sent. Payment is registered manually.\nStep 6 — Approval # The company is marked as a confirmed participant and appears on the participant list.\nWhat the Prototype Proved # This was a demo, not a finished product. E-mails are not actually sent (they are logged to the console). There is no authentication. The model is small and requires manual local setup of Ollama.\nBut that is not the point. The prototype had to answer one question: is the structure sound? Does AI actually save time?\nThe answer is yes. The most time-consuming manual tasks (writing a personalized e-mail to 80 companies, generating 80 individual contracts, calculating and sending invoices) are exactly the ones AI handles best. The output is editable, so the admin keeps control without starting from scratch.\nThe Road from Prototype to Automated Flow # The current prototype still requires many manual clicks and registrations. Here is how we want to improve it:\n1. Real E-mail Integration # Replace the mock sending with an actual SMTP integration (e.g. via SendGrid or JavaMail). The system should send e-mails automatically, log delivery status, and handle bounces.\n2. Automatic Status Follow-up # Instead of the admin registering replies manually, we can integrate with the inbox via IMAP and use AI to classify replies: “yes”, “no”, “question”, “no reply”. Status is updated automatically.\n3. Automatic Reminders # The system should send reminders on its own, e.g. if a contract is unsigned after 5 days, or an invoice is unpaid past its due date. 
A simple cron job checks daily and triggers AI-generated follow-up e-mails.\n4. Upgrade to a Stronger Model # llama3.1:8b is fine for simple text, but for more nuanced contracts and personalized e-mails a larger model (or a Claude API call) would give markedly better output. The architecture is already set up to swap the model.\n5. Digital Contract Signing # Integrating an e-signature service (e.g. Penneo or DocuSign) eliminates the manual step of sending PDFs and waiting for scanned signatures.\n6. Dashboard Overview # A real-time dashboard showing how many companies are in each step, the expected revenue based on signed contracts, and which companies need action, automatically flagged.\nConclusion # The Julemarket application demonstrates a clear thesis: AI is well suited to automating repetitive, text-heavy administrative tasks, and a relatively small amount of code can replace many hours of manual work.\nThe prototype is not production-ready, but it proves that the architecture holds. The 6-step workflow is logical and covers the full lifecycle. The AI output is usable and editable. 
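The pricing formula from steps 4 and 5 can be sketched in Java. This is a minimal illustration only; the class and method names are hypothetical, not taken from the actual codebase:

```java
// Minimal sketch of the invoice calculation described above.
// Class, method, and constant names are illustrative, not from the real project.
public class InvoiceCalculator {

    private static final int PRICE_PER_CHAIR_PER_DAY = 500;    // kr
    private static final int FIXED_FEE_PER_DAY = 200;          // kr
    private static final int PRICE_PER_EMPLOYEE_PER_DAY = 100; // kr

    /** Price = (chairs x 500 kr/day) + (200 kr/day fixed) + (staff x 100 kr/day) */
    public static int totalPrice(int chairs, int employees, int days) {
        int perDay = chairs * PRICE_PER_CHAIR_PER_DAY
                   + FIXED_FEE_PER_DAY
                   + employees * PRICE_PER_EMPLOYEE_PER_DAY;
        return perDay * days;
    }

    public static void main(String[] args) {
        // 4 chairs, 2 staff, 2 days: (4*500 + 200 + 2*100) * 2 = 4800 kr
        System.out.println(totalPrice(4, 2, 2));
    }
}
```

Keeping the formula in one pure function makes the AI-generated invoice easy to verify by hand before it is sent.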
The structure is scalable.\nThe next step is to replace the mock components with real integrations and add the automatic follow-up logic that turns it into a genuinely time-saving tool, not just a clever prototype.\n","date":"11 May 2026","externalUrl":null,"permalink":"/Portfolio/projects/julemarket-ai-platform/","section":"AI Projects","summary":"","title":"Julemarket: From Manual Paperwork to AI-Driven Administration","type":"projects"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/Portfolio/tags/ollama/","section":"Tags","summary":"","title":"Ollama","type":"tags"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/Portfolio/tags/postgresql/","section":"Tags","summary":"","title":"Postgresql","type":"tags"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/Portfolio/categories/projects/","section":"Categories","summary":"","title":"Projects","type":"categories"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/Portfolio/tags/react/","section":"Tags","summary":"","title":"React","type":"tags"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/Portfolio/tags/","section":"Tags","summary":"","title":"Tags","type":"tags"},{"content":"","date":"5 May 2026","externalUrl":null,"permalink":"/Portfolio/tags/claude/","section":"Tags","summary":"","title":"Claude","type":"tags"},{"content":"","date":"5 May 2026","externalUrl":null,"permalink":"/Portfolio/tags/experiment/","section":"Tags","summary":"","title":"Experiment","type":"tags"},{"content":"","date":"5 May 2026","externalUrl":null,"permalink":"/Portfolio/tags/habit-tracker/","section":"Tags","summary":"","title":"Habit-Tracker","type":"tags"},{"content":"Each post captures what I learned in a given session — from setting up this site to building RAG pipelines and deploying AI agents.\n","date":"5 May 
2026","externalUrl":null,"permalink":"/Portfolio/lessons/","section":"Lessons","summary":"","title":"Lessons","type":"lessons"},{"content":" What did I do? # The Council article described a multi-agent build system where six specialists review every decision at structured checkpoints. The natural follow-up question: what does the same app look like when you skip all of that and just ask one agent to build it?\nI ran exactly that experiment. Same feature spec, same stack — React + Vite frontend, Java + Javalin + Hibernate + PostgreSQL + Lombok backend — but no council, no structured phases, no specialist review. Just a single agent writing files from top to bottom.\nOne important caveat up front: the spec I was given was informed by the council\u0026rsquo;s previous build. It already included details like \u0026ldquo;completedToday in HabitResponse\u0026rdquo;, \u0026ldquo;bitmask for DayOfWeek with AttributeConverter\u0026rdquo;, and \u0026ldquo;single bulk query in DashboardService\u0026rdquo;. Those are findings the council had to actively catch. The standard agent got them handed in the prompt. 
This limits how directly the two builds can be compared on correctness — the spec had already been hardened.\nWhat was built # Backend — 30 Java source files:\nLayer Files Entities User, Habit, HabitLog, HabitLogId DAOs UserDao, HabitDao, HabitLogDao Services AuthService, HabitService, HabitLogService, DashboardService Controllers AuthController, HabitController, HabitLogController, DashboardController Config HibernateConfig, AppConfig, AuthFilter DTOs RegisterRequest, LoginRequest, AuthResponse, HabitRequest, HabitResponse, DashboardResponse, WeekDay Util JwtUtil, DayOfWeekConverter Exception ApiException Entry point Main Frontend — 16 files:\nLayer Files Pages Login, Register, Habits, Dashboard Components Navbar, HabitCard, HabitDrawer, ProgressRing, WeekGrid API client.js Root App.jsx, main.jsx, index.css, index.html, package.json, vite.config.js Design choices made independently:\nColor scheme: GitHub-dark inspired (#0d1117 background, #161b22 surface, #10b981 emerald accent) — distinct from the council’s indigo palette ProgressRing component for per-habit streak visualization on the dashboard Heat-map style WeekGrid (5-shade green gradient) for the weekly calendar Slide-in drawer for habit creation with day toggles, color swatches, and emoji icon picker Optimistic UI on the completion toggle — local state flips immediately, API call fires in the background Results # Metric Value Wall-clock build time ~8 minutes Tool calls made 50 Files produced 46 (30 Java + 16 frontend) Phases 1 (no structured checkpoints) Council consultations 0 Specialist reviews 0 Issues caught mid-build 0 Token count for this session is not directly readable from within the agent, but based on output volume and comparison with the failed background agent attempts earlier in the same session (~15k tokens for 3-5 tool calls each), the build phase itself is estimated at ~45,000–55,000 tokens.\nStandard Agent vs. 
The Council # Standard Agent The Council Wall-clock time ~8 minutes ~35 minutes Tool calls 50 105 Estimated tokens ~45–55k 119,050 Files produced 46 33 Java + React frontend Structured review phases None 4 Council consultations 0 10 High-risk findings resolved 0 1 Medium-risk findings resolved 0 4 Low-risk findings resolved 0 5 Audit trail None Full log of every question and risk rating What the standard agent got right on its own # Given a complete spec, the standard agent independently made every architecture decision the council had to actively catch in its own build:\nNo @Data on entities — used @Getter, @Setter, @Builder only, avoiding the StackOverflowError from Hibernate 6 hashCode on lazy collections Single bulk query in DashboardService — findAllDatesByUserId() returning Map<UUID, List<LocalDate>>, streaks computed in memory — no N+1 JWT secret validated at startup — IllegalStateException if JWT_SECRET is missing or shorter than 32 chars, no fallback Ownership checks at the service layer — every habit mutation and toggle verifies habit.getUserId().equals(userId) before proceeding completedToday in HabitResponse — included from the start, no separate network call needed Index on habits(user_id) — in schema.sql from the start The important question is whether these decisions came from genuine independent reasoning or from a spec that had already been hardened by the council’s prior work. The honest answer is: both. The spec included the bitmask requirement and the single-query requirement explicitly. The ownership check placement and the Lombok constraint came from the agent’s own judgment.\nWhat the standard agent did NOT do # The standard agent produced zero review findings. 
That is not because the code is perfect — it is because no one was checking.\nThings a council review would likely have flagged:\nNo input validation on date parameter in the toggle endpoint — a malformed date string throws an uncaught DateTimeParseException, which falls through to the generic 500 handler instead of returning a clean 400 No rate limiting on /auth/login — brute-force protection is absent @EqualsAndHashCode on HabitLogId — technically fine for an @Embeddable, but the Security Guard or Quality Critic would have flagged it for consistency review CORS is locked to localhost:5173 — acceptable for development, but a real deployment needs an env-configurable origin None of these are catastrophic. All of them are the kind of thing that gets caught in a structured review and missed in a solo build.\nPost-build: what broke during setup # The code compiled and the build completed, but getting it running revealed three issues the agent did not anticipate.\n1. Lombok annotation processor not configured (compile error)\nThe first mvn clean package failed with dozens of “cannot find symbol” errors — getId(), getName(), builder() — across every class using Lombok. The root cause: the maven-compiler-plugin was not explicitly configured with an annotationProcessorPaths entry for Lombok.\nLombok is declared as a provided dependency, but newer Maven versions do not automatically wire provided dependencies as annotation processors. 
The fix required adding this to pom.xml:\n<plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-compiler-plugin</artifactId> <version>3.13.0</version> <configuration> <annotationProcessorPaths> <path> <groupId>org.projectlombok</groupId> <artifactId>lombok</artifactId> <version>1.18.34</version> </path> </annotationProcessorPaths> </configuration> </plugin> This is a recurring gotcha — it was not caught during code generation because the agent writes files but does not compile them. The council’s Quality Critic or Structure Warden would likely have flagged this, since it is a known Maven + Lombok setup requirement.\n2. JavalinJackson constructor signature changed (compile error)\nAfter fixing Lombok, a second compile error appeared: no suitable constructor found for JavalinJackson(ObjectMapper). In Javalin 6 the constructor signature changed to JavalinJackson(ObjectMapper, boolean) — the boolean controls 5xx status code preference. The fix was a one-line change: new JavalinJackson(mapper, false).\nThe agent used the Javalin 5 signature from training data. A council with a dedicated Architect reviewing API contracts against the actual library version would have caught this.\n3. psql not on PATH (schema setup)\nRunning the schema required psql, which was not found — PostgreSQL’s bin directory was not in the system PATH. On a fresh Windows installation this is common but easy to miss when writing setup instructions.\nThe project was deleted after these issues were documented. 
The app was not verified running end-to-end.\nRevised results including post-build issues # Issue Caught during build Root cause Lombok not processing No — compile error after delivery Missing annotationProcessorPaths in pom.xml JavalinJackson wrong constructor No — compile error after delivery Javalin 6 API change, agent used Javalin 5 signature psql not on PATH No — runtime setup failure Windows PATH not configured, not checked in instructions Missing completedToday in DTO Never arose — spec included it Spec pre-hardened by council N+1 in DashboardService Never arose — spec included fix Spec pre-hardened by council JWT secret fallback Never arose — spec included fix Spec pre-hardened by council The standard agent delivered 46 files in 8 minutes. Two of them had bugs that prevented compilation. The council’s structured review — specifically the Architect reviewing library API contracts and the Quality Critic reviewing build configuration — would have caught both before delivery.\nToken breakdown: build vs. debug # The session JSONL file stores per-turn token counts, making it possible to split the numbers precisely by phase.\nPhase Output tokens Cache creation tokens Total new tokens Build phase (~8 min) ~73,000 ~165,000 ~238,000 Debug phase (post-build fixes) 31,501 49,509 81,010 Full session 104,683 214,369 319,052 The debug phase — three rounds of error output, two file fixes, and a few setup exchanges — added 31,501 output tokens and 49,509 tokens of new cached context. That is roughly 43% of what the build itself produced in output, for work that delivered zero new features.\nThe headline comparison:\nStandard Agent (build only) Standard Agent (build + debug) The Council Output tokens ~73,000 104,683 ~119,050 Wall-clock time ~8 minutes ~28 minutes (incl. 
debug) ~35 minutes Compile errors on delivery 2 — 0 Files that compiled first try 44 / 46 — 33 / 33 When you include the debugging needed to get the code to actually compile, the standard agent’s output token cost (~105k) converges with the council’s (~119k). The speed advantage narrows from 4× to under 2×. And the council shipped code that compiled on the first attempt.\nThe cache read tokens (10M+ cumulative across 169 turns) are not directly comparable — they reflect the growing conversation context being re-read each turn and scale with session length, not with the amount of work done.\nThe real finding # The speed difference is striking. The standard agent produced a complete, working full-stack application in 8 minutes. The Council took 35 minutes and used roughly 2-3x the tokens.\nFor a clearly scoped greenfield project with a hardened spec, the standard agent is dramatically faster and cheaper, and once the compile errors are fixed the output is functionally comparable.\nThe council’s value shows up in two places the numbers do not capture:\nSpec hardening — the standard agent was given a spec that was already corrected by the council. In a real first-pass build, without that prior work, the standard agent would have shipped with the N+1, the missing completedToday, and the hardcoded JWT fallback. The council catches those before they exist.\nThe audit trail — the council produces a log of every risk rating and recommendation. The standard agent produces only code. When something goes wrong in production, the council’s log tells you exactly what was reviewed and what was not.\nThe standard agent is the right tool when the spec is tight and speed matters. 
The council is the right tool when you are defining the spec for the first time and correctness is the priority.\nComponents # Claude Code — CLI and agent runtime Single agent — no subagents, no structured review, one session from prompt to final file Pro tip of the day # The fastest way to use the council is not on every build — it is on the first build of a new system. Let the council harden the spec and catch the design-level issues. Then use that hardened spec to run standard agents on every feature after that. You get the council’s correctness guarantees on the architecture and the standard agent’s speed on execution.\n","date":"5 May 2026","externalUrl":null,"permalink":"/Portfolio/lessons/standard-agent-habit-tracker/","section":"Lessons","summary":"","title":"Standard Agent: Building the Same Habit Tracker Without a Council","type":"lessons"},{"content":"","date":"4 May 2026","externalUrl":null,"permalink":"/Portfolio/tags/multi-agent/","section":"Tags","summary":"","title":"Multi-Agent","type":"tags"},{"content":" What did I do? # Most developers use an AI coding assistant the same way they use a search engine: ask a question, get an answer, move on. The assistant is a generalist. It knows a little about everything and a lot about nothing in particular. That is fine for small tasks. For building a real system, it starts to show cracks.\nWhen one agent is simultaneously the architect, the security reviewer, the database designer, the UX designer, and the code quality enforcer, it will make tradeoffs you never asked for. It will focus on what it was most recently asked about. It will miss things that fall between concerns. And it will not push back on its own decisions.\nI wanted something better. 
So I built The Council — a multi-agent system where six specialist agents each own a narrow domain, and every significant decision in a build is routed to the relevant specialist.\nThe Council Members # Agent Domain Architect System design, API contracts, data flow, component boundaries Security Guard Auth flows, data exposure, input validation, OWASP Top 10 DB Whisperer Hibernate mapping, schema design, N+1 detection, PostgreSQL Quality Critic SOLID principles, naming, patterns, React hooks, Lombok usage Structure Warden Package and folder organization, layer discipline, class placement UI Artisan UX patterns, visual design, accessibility, frontend performance Each agent responds in the same structured format every time — no open-ended answers, no lengthy explanations:\n[Risk]: high / medium / low / none [Concern]: <the specific issue> [Recommendation]: <the concrete fix> All high-risk findings must be resolved before the build continues. A seventh agent, the Builder, acts as the orchestrator — running builds in four structured phases and consulting the council at defined checkpoints.\nWhy Six Specialists? # The stack I work with — React + Vite on the frontend, Java + Javalin + Hibernate + PostgreSQL + Lombok on the backend — has enough moving parts that a single generalist agent regularly makes inconsistent decisions across layers. It designs a clean API and then forgets about it halfway through the service layer. It maps Hibernate entities correctly and then suggests @Data on an entity (which causes StackOverflowError in Hibernate 6 due to hashCode on lazy collections). It builds a beautiful component and ships it without ARIA labels.\nThe Council mirrors how a real engineering team works. Senior engineers do not review everything themselves. They route security questions to the person who thinks about security all day. They route schema decisions to the person who has been burned by N+1 queries before. 
Specialization produces better outcomes than generalism at scale.\nEach council member is a Claude subagent with a tightly scoped system prompt. The scope is enforced explicitly — every member’s prompt includes a Do NOT comment on clause listing the domains owned by other members. This prevents the sprawl that comes from a generalist trying to comment on everything at once.\nThe UI Artisan was the last member added. Originally the council had five members covering the backend-heavy concerns. The frontend and UX were getting reviewed by the Quality Critic and Structure Warden, which left visual design, accessibility, and UX patterns without a dedicated voice.\nThe Test: Building a Habit Tracker # To stress-test the full system, I had the Builder construct a Habit Tracker application from scratch:\nUser accounts with register and login (JWT auth, BCrypt passwords) Habits with name, category, target frequency (daily or specific weekdays), color, and icon Per-day completion tracking with toggles A dashboard — design decisions delegated entirely to the UI Artisan The Builder ran four phases, consulting the council at each checkpoint.\nPhase 1 — Planning (All 6 in Parallel) # Before writing a single line of code, the Builder presented the full plan to all six council members simultaneously. This is the highest-leverage consultation: catching design problems before they are baked into code.\nWhat the council caught:\nSecurity Guard (high risk): Ownership checks must be enforced on every single endpoint — not just the obvious ones. Every habit mutation and completion toggle must verify the habit belongs to the requesting user at the service layer, not just the controller. Architect (medium risk): completedToday was missing from the HabitResponse DTO. Without it, the frontend would need a separate network call per habit just to know whether to show a completed state. 
DB Whisperer (medium risk): Storing target weekdays as a List<String> in a join table would produce unnecessary rows and joins. Better: a bitmask integer with an AttributeConverter<Set<DayOfWeek>, Integer> — one column, no join table, O(1) reads. UI Artisan (low risk): Defined the design system upfront — dark theme (#0f0f13 background, #1a1a24 surface, #6366f1 indigo accent), ProgressRing component for per-habit completion rate, a weekly calendar grid, a streak counter, and a slide-in drawer for the habit creation form. Optimistic UI on the completion toggle so it feels instant. Phase 2 — Schema + API Design # DB Whisperer: Confirmed composite primary key mapping for Completion in Hibernate 6. Flagged a missing index on habits(user_id) that would cause full table scans on every habit list. Architect: Caught that DashboardStatsResponse was missing a weeklyData field needed to render the weekly calendar widget. Without it the widget would have had no data source. Phase 4 — Pre-Done Checkpoint # Security Guard (medium risk): The JWT secret had a hardcoded fallback value in source code. Fix: throw IllegalStateException at startup if JWT_SECRET env var is missing or shorter than 32 characters. No fallback, ever. Quality Critic (medium risk): DashboardService.getStats() was calling findAllDatesByHabitId() inside a loop over all habits — an N+1 query at the service layer. Fix: single findAllDatesByUserId() query returning Map<UUID, List<LocalDate>>, then compute all streaks in memory from that map. Results # Metric Value Total tokens used 119,050 Tool calls made 105 Wall-clock build time ~35 minutes Council consultations 10 across 4 phases High-risk findings resolved 1 Medium-risk findings resolved 4 Low-risk findings resolved 5 Files produced 33 Java source files + full React frontend Council vs. Standard Agent # A standard agent running the same build would likely use 40–70k tokens. 
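The bitmask encoding the DB Whisperer recommended can be sketched with plain java.time. This is a minimal sketch of the encoding logic only; in the real project it would live inside a jakarta.persistence.AttributeConverter<Set<DayOfWeek>, Integer>, and the method names here are illustrative:

```java
import java.time.DayOfWeek;
import java.util.EnumSet;
import java.util.Set;

// Sketch of the bitmask behind the recommended AttributeConverter<Set<DayOfWeek>, Integer>:
// one int column, bit i set when the (i+1)-th ISO weekday is a target day.
public class DayOfWeekBitmask {

    static int toBitmask(Set<DayOfWeek> days) {
        int mask = 0;
        for (DayOfWeek d : days) {
            mask |= 1 << (d.getValue() - 1); // MONDAY=1 -> bit 0 ... SUNDAY=7 -> bit 6
        }
        return mask;
    }

    static Set<DayOfWeek> fromBitmask(int mask) {
        Set<DayOfWeek> days = EnumSet.noneOf(DayOfWeek.class);
        for (DayOfWeek d : DayOfWeek.values()) {
            if ((mask & (1 << (d.getValue() - 1))) != 0) days.add(d);
        }
        return days;
    }

    public static void main(String[] args) {
        Set<DayOfWeek> target = EnumSet.of(DayOfWeek.MONDAY, DayOfWeek.WEDNESDAY, DayOfWeek.FRIDAY);
        int mask = toBitmask(target);               // 1 + 4 + 16 = 21
        System.out.println(mask);
        System.out.println(fromBitmask(mask).equals(target)); // round-trips losslessly
    }
}
```

One int column replaces a join table, and reads and writes are O(1) bit operations.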
The Council used 119k. That is the cost of structured review — more total tokens, but with a higher signal-to-noise ratio on what gets shipped.\nStandard Agent The Council Token cost Lower (~40–70k) Higher (~119k) Audit trail None Full log of every question and risk rating Ownership check catch Unlikely Caught in Phase 1 (high risk) N+1 service layer catch Unlikely Caught in Phase 4 (medium risk) JWT fallback secret catch Unlikely Caught in Phase 4 (medium risk) UX design ownership Ad hoc Dedicated specialist, defined upfront The two findings that matter most — the N+1 at the service layer and the hardcoded JWT fallback — are exactly the kind of issues that fall between concerns. The N+1 shows up at the intersection of the service layer and the query layer. The JWT fallback is a security issue that looks like a convenience decision. A generalist building at speed misses both. The specialists whose entire job is to think about those intersections catch them before they ship.\nComponents # Claude Code — the CLI and agent runtime 6 specialist subagents — each with a scoped system prompt and a structured response format The Builder — orchestrator agent running the 4-phase build protocol council-history.json — shared audit log of every consultation council.py — interactive terminal CLI for consulting the council directly Pro tip of the day # When building the council, the most important design decision was the Do NOT comment on clause in every agent’s prompt. Without it, every specialist tries to cover everything — and you end up with six generalists instead of six specialists. 
Narrow scope is what makes specialization work.\n","date":"4 May 2026","externalUrl":null,"permalink":"/Portfolio/lessons/the-council-multi-agent-build-system/","section":"Lessons","summary":"","title":"The Council: Building a Multi-Agent Code Review System","type":"lessons"},{"content":" What This Is # Claude Code is Anthropic’s official CLI for Claude — an AI assistant you interact with directly in your terminal or IDE. One of its features is a skills system: pre-configured instruction sets that modify how Claude responds for a given task.\nThe token_optimization skill instructs Claude to minimize token usage at all times — shorter answers, no docstrings, abbreviated names, compact formatting. The goal is to get the same functional output with fewer words.\nThis post documents a controlled test of that skill: the same task, run twice (once without the skill, once with), and then repeated five times in each mode to check consistency.\nWhy I Did It # Token usage matters for two reasons:\nPlan limits — Claude’s usage plans cap how many tokens you can consume per period. Fewer output tokens per interaction means more interactions before hitting a ceiling. Response quality — Reducing tokens without losing accuracy is useful. But reducing tokens at the cost of readability is a trade-off worth measuring. I wanted to see the actual numbers, not just assume the skill helps.\nThe Test # Task (identical for all runs):\nImplement a Binary Search Tree in Python with insert, search, and in-order traversal methods. Then generate a 5-question multiple choice quiz about the implementation, with 4 options per question and feedback for each option.\nMeasurement method: Character count ÷ 4 (standard approximation for English + code token estimation). 
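The chars-divided-by-4 estimation used throughout the study can be sketched in a few lines; `estimateTokens` is a hypothetical helper for illustration, not part of the study's actual tooling:

```java
public class TokenEstimate {
    // Rough rule of thumb used in the study: ~4 characters per token
    // for English prose plus code. Accurate to roughly +/-10-15%.
    static long estimateTokens(String text) {
        return Math.round(text.length() / 4.0);
    }

    public static void main(String[] args) {
        String sample = "Implement a Binary Search Tree in Python with insert and search.";
        System.out.println(estimateTokens(sample)); // 64 characters -> prints 16
    }
}
```

A real tokenizer would give exact counts, but for comparing two outputs of the same kind of content, the ratio is what matters and the approximation preserves it.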
Margin of error: ±10–15%.\nSingle-Run Results # The first comparison was one run of each mode.\nMetric Standard Optimized Estimated output tokens ~1,175 ~415 Share of combined output 74% 26% Token reduction — ~65% Both runs produced correct code and accurate quiz questions. The reduction came from removing docstrings, verbose feedback, expanded variable names, and the __main__ block — not from cutting content coverage.\n5-Run Study # To check consistency, the test was repeated five times in each mode.\nStandard Mode (Skill Off) # Run Est. Tokens S1 ~1,050 S2 ~1,113 S3 ~1,038 S4 ~1,128 S5 ~1,038 Average ~1,073 Range: 1,038–1,128 | Std dev: ±38 tokens\nOptimized Mode (Skill On) # Run Est. Tokens O1 ~398 O2 ~333 O3 ~358 O4 ~390 O5 ~305 Average ~357 Range: 305–398 | Std dev: ±38 tokens\nAggregate Comparison # Metric Standard Optimized Avg estimated tokens 1,073 357 Share of combined avg 75% 25% Avg token reduction — ~67% S1 ████████████████████████████░░░░░░░░░░░░ 1,050 S2 ██████████████████████████████░░░░░░░░░░ 1,113 S3 ████████████████████████████░░░░░░░░░░░░ 1,038 S4 ██████████████████████████████░░░░░░░░░░ 1,128 S5 ████████████████████████████░░░░░░░░░░░░ 1,038 O1 ███████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 398 O2 █████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 333 O3 ██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 358 O4 ███████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 390 O5 ████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 305 Output Quality # Both modes produced functionally correct code and factually accurate quizzes across all runs.\nCode # Attribute Standard Optimized Docstrings Full, every method None in most runs Variable naming Descriptive (value, node) Abbreviated (v, n) Inline comments Yes Minimal or none Functional correctness Correct, all 5 runs Correct, all 5 runs Readable by newcomer Yes No Run-to-run consistency High Moderate Quiz # Attribute Standard Optimized Question length ~10 words avg ~5 words avg Feedback per option 1–2 sentences, explains why 3–8 words, verdict only 
Factual accuracy Correct, all 5 runs Correct, all 5 runs Suitable for self-study Yes No Suitable for quick review No Yes Consistency # Mode Spread Spread as % of avg Standard 90 tokens ~8% Optimized 93 tokens ~26% Absolute spread was similar for both modes, but the optimized mode\u0026rsquo;s spread is proportionally larger — meaning the skill\u0026rsquo;s compression varies more from run to run than standard mode\u0026rsquo;s output does.\nImpact on Plan Usage Limits # Plan limits count total tokens — both input and output. Input tokens (the prompt, system context, conversation history) stay roughly the same regardless of which mode is used.\nAssuming ~400 input tokens per interaction:\nMode Avg input Avg output Avg total Standard ~400 ~1,073 ~1,473 Optimized ~400 ~357 ~757 Estimated total token reduction: ~49%\nThis is lower than the 67% output-only figure because input tokens are the same in both modes. In practical terms: roughly 1.9× more interactions within the same plan limit per period, assuming input size stays constant.\nTakeaways # The skill consistently reduced output tokens by ~67% across all 5 runs. The direction was the same every time — no optimized run exceeded any standard run. Correctness was not affected. Both modes delivered working code and accurate quizzes. The cost is readability and context. Optimized output is harder to follow without prior knowledge, and quiz feedback gives verdicts without explanations. The real-world plan usage reduction is ~49%, not 67%, once you account for input tokens. Standard mode is more consistent. Optimized mode varies more in how aggressively it compresses, which means the savings are less predictable run-to-run. Neither mode is strictly better. The appropriate choice depends on who will read the output and why.\nSkill v2: Quality-Preserving Compression # The v1 results raised an obvious question: can you get most of the token savings without the quality regressions? 
The v1 skill was cutting too deep — removing docstrings, abbreviating class names (BST instead of BinarySearchTree), stripping __main__ blocks, and reducing quiz feedback to bare verdicts. Correct output, but not code you\u0026rsquo;d hand to someone else, and not quiz feedback that actually teaches anything.\nVersion 2 of the skill was redesigned around two explicit zones:\nCOMPRESS — prose, filler, transitions, repetition. Cut everything here. PRESERVE — code names, structure, conventions, and quiz reasoning. Never touch these. The specific changes: full descriptive class and method names always, a one-line docstring for every public class and function, standard __main__ blocks retained, quiz feedback expanded to one sentence explaining why (not just \u0026ldquo;Correct\u0026rdquo;), and an explicit banned-phrase list to prevent filler from creeping back in.\nv2 5-Run Results # Run Code (chars) Quiz (chars) Total (chars) Est. tokens V1 1,714 1,059 2,773 ~693 V2 1,714 1,094 2,808 ~702 V3 1,714 1,130 2,844 ~711 V4 1,714 1,186 2,900 ~725 V5 1,714 1,197 2,911 ~728 Average 1,714 1,133 2,847 ~712 Range: 693–728 | Spread: ~35 tokens (~5% of avg)\nThree-Way Comparison # Metric Standard v1 Optimized v2 Optimized Avg est. tokens ~1,073 ~357 ~712 vs. Standard — −67% −34% vs. 
v1 — — +99% Standard ████████████████████████████████████████ ~1,073 v2 Optimized ████████████████████████░░░░░░░░░░░░░░░░ ~712 v1 Optimized █████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░ ~357 Code Quality # Attribute Standard v1 Optimized v2 Optimized Class name BinarySearchTree BST BinarySearchTree Method names inorder_traversal inorder inorder_traversal Docstrings Full, all methods None One-liner, public only __main__ block Yes, with labels No (bare calls) Yes, with inline comments Functional correctness Correct, all runs Correct, all runs Correct, all runs Readability (newcomer) High Low Moderate–High Run-to-run consistency High Moderate Very High Quiz Quality # Attribute Standard v1 Optimized v2 Optimized Question length ~10 words avg ~5 words avg ~8 words avg Feedback 1–2 sentences, explains why 3–8 words, verdict only 1 sentence, explains why Suitable for self-study Yes No Yes Suitable for quick review No Yes Yes Consistency # Mode Spread Spread as % of avg Standard ~90 tokens ~8% v1 Optimized ~93 tokens ~26% v2 Optimized ~35 tokens ~5% Code output was identical across all 5 v2 runs (1,714 chars every time). The PRESERVE rules lock structure and naming completely — only quiz question wording varies run-to-run. v2 is more predictable than either prior mode, despite sitting between them on token volume.\nImpact on Plan Usage Limits # Mode Avg input Avg output Avg total vs. Standard Standard ~400 ~1,073 ~1,473 — v1 Optimized ~400 ~357 ~757 −49% v2 Optimized ~400 ~712 ~1,112 −25% v1 gave roughly 1.9× more interactions per plan period. v2 gives ~1.3× — meaningful, but more conservative. The gap reflects the token cost of restoring docstrings, full names, __main__, and quiz reasoning.\nTakeaways # v2 reduces output tokens by ~34% from Standard — meaningful compression without touching anything that affects usability. All quality regressions from v1 are closed: descriptive names, public docstrings, __main__ blocks, and quiz reasoning are all restored. 
v2 is significantly more consistent than v1. Code output is completely deterministic across runs; only quiz wording introduces any variation. The token cost of quality is real but bounded: v2 uses roughly twice the output tokens of v1 (~712 vs. ~357), almost entirely from the restored quality features. v2 is a better fit for always-on usage. v1 is the right choice when maximum compression matters more than readability; v2 is the right choice when the output needs to stand on its own. ","date":"30 April 2026","externalUrl":null,"permalink":"/Portfolio/lessons/claude-token-optimization-test/","section":"Lessons","summary":"","title":"Testing Claude Code's Token Optimization Skill","type":"lessons"},{"content":"","date":"30 April 2026","externalUrl":null,"permalink":"/Portfolio/tags/token-optimization/","section":"Tags","summary":"","title":"Token-Optimization","type":"tags"},{"content":" What did I do? # This time we learned about AI-Agents: what they are, what they can do, what they can\u0026rsquo;t do, and how they will impact the future of software programming. We also discussed the importance of writing a good prompt to limit useless code from the agent. After the theory, we moved on to creating an app using an AI-Agent of our choice (Claude in my case).\nComponents # AI-Agent (Claude, Codex, etc.) A \u0026ldquo;customer\u0026rdquo; with a product they want made. Writing a good prompt Pro tip of the day # When prompting, end all prompts with: \u0026ldquo;And make no mistakes!\u0026rdquo;. That will tell the AI to NOT make any mistakes :)\n","date":"20 April 2026","externalUrl":null,"permalink":"/Portfolio/lessons/ai-agents-fourthlesson/","section":"Lessons","summary":"","title":"AI-Agents","type":"lessons"},{"content":" What did I do? # We used Dify, an external service that lets us experiment with LLMs, and built our own RAG-chatbot. 
After setting up Dify, we built an app that lets Dify auto-scrape our Hugo site, ingest all the information on it, and add it to our RAG-chatbot.\nThis process runs whenever we update our Hugo site, so the chatbot stays in sync with the site.\nComponents # Dify service (a service for easily creating RAG-chatbots) Our own backend app An API key from Dify used in our backend ","date":"17 April 2026","externalUrl":null,"permalink":"/Portfolio/lessons/dify-scraper-thridlesson/","section":"Lessons","summary":"","title":"Dify auto scraper","type":"lessons"},{"content":" What I\u0026rsquo;ve learned so far # ","date":"16 April 2026","externalUrl":null,"permalink":"/Portfolio/projects/localllmchatbot/","section":"AI Projects","summary":"","title":"Local LLM Chatbot","type":"projects"},{"content":" What is RAG? # RAG stands for Retrieval-Augmented Generation\u0026hellip;\nComponents # Embedding Vectorization Chunks/Chunking Re-rankers Cosine Similarity Dify service (a service for easily creating RAG-chatbots) ","date":"13 April 2026","externalUrl":null,"permalink":"/Portfolio/lessons/rag-chatbots-secondlesson/","section":"Lessons","summary":"","title":"Introduction to RAG-Chatbot","type":"lessons"},{"content":"Current Problems to solve: # Problems for the local LLM AI chatbot workout-planner project # Just allow the chatbot to be available when on the main PC?\nIs it possible to make a local LLM available 24/7 even when the main PC is shut down (i.e., run the LLM on a virtual machine)?\nUse OpenAI but on a local LLM (so no money spent) Short intro text here.\nFull project details here\u0026hellip;\n","date":"13 April 2026","externalUrl":null,"permalink":"/Portfolio/projects/trainingcalenderai-chatbot/","section":"AI Projects","summary":"Current Problems to solve: # Problems for the local LLM AI chatbot workout-planner project # Just allow the chatbot to be available when on the main PC?\nIs it possible to make a local LLM available 24/7 even when the main PC is shut down (i.e., run the LLM on a virtual 
machine)?\nUse OpenAI but on a local LLM (so no money spent) Short intro text here.\n","title":"Workout-Planner AI Chatbot","type":"projects"},{"content":" What I Learned # This course is about learning what AI is, how we can use it, and how to learn while using it. The exam is a project we each decide on individually. What Hugo is and how to use it with Bluefish to create this very website. How to create blogs after each lesson # ","date":"10 April 2026","externalUrl":null,"permalink":"/Portfolio/lessons/intro-ai-firstlesson/","section":"Lessons","summary":"","title":"Introduction to AI-Driven Applications","type":"lessons"},{"content":"","externalUrl":null,"permalink":"/Portfolio/authors/","section":"Authors","summary":"","title":"Authors","type":"authors"},{"content":"","externalUrl":null,"permalink":"/Portfolio/series/","section":"Series","summary":"","title":"Series","type":"series"}]