Eirini Mantzouni

Data Analyst & Research Engineer · PhD

Data Analyst and Research Engineer with 8+ years building full-stack data applications — from anonymisation engines and policy analytics platforms to AI-powered dashboards and R packages — for EU institutions, government agencies, and research projects. Core stack: Python, R, SQL, Power BI, cloud deployment.

PhD in Quantitative Ecology & Statistical Modelling (University of Copenhagen, Marie Curie Fellowship). External Expert for the EU Scientific, Technical and Economic Committee for Fisheries (STECF) since 2017, contributing to stock assessments, regulatory impact analyses, and fisheries data infrastructure across 10+ Expert Working Groups. Currently contributing to European Commission projects (DG EMPL) involving AI-powered analytics and digital transformation.

Projects


Statistical Disclosure Control — Streamlit App

Python AI/LLM

Production anonymisation engine that protects individual-level datasets against re-identification using four methods (k-Anonymity, Local Suppression, PRAM, Noise Addition), selected automatically by a 40+ rule decision engine. Backward elimination risk analysis drives every downstream choice — from variable classification to per-QI protection parameters.

Features an adaptive retry loop with escalation and cross-method fallbacks, a composite utility score (Pearson correlation, KL divergence, optional ML validation), and optional AI-powered column classification via Cerebras Qwen 235B. Tested against 17 real-world Greek administrative datasets.
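A composite utility score of this kind can be sketched as a weighted blend of a correlation term and a KL-divergence term; the weights, binning, and blending below are illustrative, not the app's actual formula:

```python
import numpy as np
from scipy.stats import pearsonr, entropy

def composite_utility(orig, anon, bins=10):
    """Toy composite utility score: correlation preservation + distribution shift.
    `orig` and `anon` are equal-length 1-D numeric arrays."""
    # Correlation preservation: how closely the anonymised column tracks the original.
    corr = abs(pearsonr(orig, anon)[0])
    # Distribution shift: KL divergence between histograms on shared bin edges.
    edges = np.histogram_bin_edges(np.concatenate([orig, anon]), bins=bins)
    p = np.histogram(orig, bins=edges)[0] + 1e-9
    q = np.histogram(anon, bins=edges)[0] + 1e-9
    kl = entropy(p / p.sum(), q / q.sum())
    # Blend into a single 0-1 score; the 50/50 weights are arbitrary here.
    return 0.5 * corr + 0.5 * np.exp(-kl)

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
noisy = x + rng.normal(scale=0.1, size=1000)   # mild noise addition
print(round(composite_utility(x, noisy), 3))   # close to 1.0
```

Mild noise scores near 1.0; a heavily perturbed column drags both terms down, which is the behaviour a retry loop can escalate on.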

Stack: Python · Streamlit · Pandas · Plotly · scikit-learn · R/sdcMicro · Cerebras API

Live Demo · GitHub (private repo, available on request)

Technical Highlights

Frontend: Streamlit, custom CSS, session state management
Visualisation: Plotly (risk histograms, variable importance bars, before/after overlays)
Risk engine: custom pipeline (per-record ReID, backward elimination, structural risk, variable importance ranking)
Protection engine: 4 methods, 40+ selection rules, dynamic pipeline builder, multi-phase retry with escalation + fallbacks
Method selection: rules engine (RC, CAT, LDIV, DATE, QR, LOW, DP, HR rule families), suppression-gated kANON
Preprocessing: type-aware routing (6 priority tiers), adaptive tier loop (light to very aggressive), risk-weighted per-QI cardinality limits
Privacy metrics: k-anonymity, l-diversity (distinct + entropy), t-closeness (EMD/TVD), uniqueness rate, disclosure risk
AI integration: Cerebras Qwen 235B for column classification and method recommendation (optional)
R integration: sdcMicro for optimal local suppression and correlated noise (optional, Python fallback)
Testing: pytest (unit + integration), 17-dataset test suite covering real-world Greek administrative data

Architecture Decisions

  • Backward elimination as the foundation — per-variable risk contribution drives everything downstream: QI classification confidence, preprocessing aggressiveness, per-QI protection parameters, LOCSUPR importance weights, and GENERALIZE ordering
  • Suppression-gated method selection — before selecting kANON at any k value, the engine pre-estimates suppression rate from equivalence class sizes. If estimated suppression exceeds 25%, it switches to LOCSUPR or PRAM directly
  • Type-aware preprocessing before generic generalization — dates, ages, geography, and skewed numerics each get domain-specific transformations before the generic cardinality-reduction loop runs, preserving domain structure
  • Sensitive-column-scoped utility — utility is measured primarily on sensitive (analysis) columns, not QIs. The composite score reflects what downstream analysts actually care about
  • Proportional low-cardinality QI guard — columns with very few unique values relative to dataset size are demoted from QI status via a two-tier ratio test, preventing structural methods from suppressing 90%+ of their values
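The suppression-gating idea above can be sketched in a few lines of pandas; the 25% threshold matches the text, but the function names and the toy dataset are invented for illustration:

```python
import pandas as pd

def estimate_suppression_rate(df, quasi_identifiers, k=5):
    """Share of records in equivalence classes smaller than k — rows that
    k-anonymisation would have to suppress (or heavily generalise)."""
    class_sizes = df.groupby(quasi_identifiers, dropna=False)[quasi_identifiers[0]].transform("size")
    return float((class_sizes < k).mean())

def choose_method(df, quasi_identifiers, k=5, max_suppression=0.25):
    """Gate kANON behind a pre-estimated suppression rate (threshold illustrative)."""
    rate = estimate_suppression_rate(df, quasi_identifiers, k)
    return "kANON" if rate <= max_suppression else "LOCSUPR_or_PRAM"

df = pd.DataFrame({
    "age_band": ["30-39"] * 8 + ["80-89", "90+"],
    "region":   ["Attica"] * 8 + ["Crete", "Crete"],
})
print(estimate_suppression_rate(df, ["age_band", "region"], k=5))  # 0.2
print(choose_method(df, ["age_band", "region"], k=5))              # kANON
```

The cheap pre-check avoids running a full k-anonymisation pass only to discard it for excessive suppression.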

Policy Intelligence Platform — Django + HTMX

Django Python AI/LLM Cloud

Policy analytics platform tracking the EU Council Recommendation on Fair Transition across 27 member states. Ingests 1,000+ policy measures from MongoDB and provides 14 interactive visualisations with 11 simultaneous filter dimensions — every AI prompt receives the active filter state so responses always reflect what the user is looking at.

Includes a RAG-based semantic policy search (MongoDB vector search, top-5 retrieval with citations), a 6-section strategic intelligence framework, and AI-powered gap analysis. Deployed on Cloud Run with graceful degradation — the dashboard stays fully functional even without an API key.
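A minimal sketch of what the top-5 retrieval stage of such a RAG pipeline can look like with MongoDB Atlas `$vectorSearch`; the index name, field names, and oversampling factor are assumptions, not the platform's actual configuration:

```python
def build_semantic_search_pipeline(query_vector, top_k=5):
    """Aggregation pipeline for MongoDB Atlas $vectorSearch, projecting only the
    metadata an LLM needs to answer with citations. Names are illustrative."""
    return [
        {
            "$vectorSearch": {
                "index": "policy_vector_index",   # assumed index name
                "path": "embedding",              # assumed embedding field
                "queryVector": query_vector,
                "numCandidates": top_k * 20,      # oversample before exact scoring
                "limit": top_k,
            }
        },
        {
            # Keep only what the answer-generation prompt needs for citations.
            "$project": {
                "title": 1,
                "country": 1,
                "text": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]

pipeline = build_semantic_search_pipeline([0.1] * 384, top_k=5)
# collection.aggregate(pipeline) would then return the five best-matching measures.
```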

Stack: Python · Django · HTMX · MongoDB · Plotly · Tailwind CSS · Cerebras API · Cloud Run

Live Demo · GitHub (private repo, available on request)

Technical Highlights

Frontend: Django 5 + HTMX (partial page loads, no SPA), Tailwind CSS, responsive grid layout
Visualisation: 14+ Plotly chart types (choropleth, bar, pie, heatmap, stacked bar), responsive sizing
Data pipeline: MongoDB aggregation pipelines with allowDiskUse, QueryBuilder mapping 11 filter dimensions to $match/$unwind/$regex stages
AI integration: 8 generation functions (chart narratives, document analysis, in-depth analysis with 5 focus modes, Q&A, strategic analysis with 6 sections + synthesis, batch/country summaries, gap analysis)
RAG pipeline: query embedding, MongoDB $vectorSearch, top-5 retrieval, metadata enrichment, LLM answer with source citations
Prompt engineering: 8 specialised system personas, filter-aware context injection, <think> tag stripping for reasoning models
Caching: Django LocMem cache with namespace keys (ai:{type}:{md5}), filter-aware invalidation, 1-hour TTL
Deployment: Docker (python:3.12-slim), gunicorn (2 workers + 4 threads), Cloud Run (512MB, auto-scale 0-2), MongoDB Atlas
Seeding: seed_mongo.py generates 300 docs / 1,050 measures / 80 linkage groups across 14 EU countries

Architecture Decisions

  • Filter-scoped AI context — every AI prompt includes the active filter state as a human-readable preamble, so LLM responses reflect the user's current view
  • Deduplication at query time — linkage collection maps duplicates to canonical IDs. Matched canonicals expand to include all duplicates; when displaying, duplicates are collapsed
  • Document-grounded deep analysis — in-depth analysis uses the full parsed document text (up to 25K chars) rather than just metadata, enabling the LLM to cite specific provisions
  • Graceful AI degradation — when no API key is configured, all AI endpoints return informative placeholders, keeping the dashboard fully functional for data exploration
  • Slim Docker image — copies only the 6 modules Django actually imports, keeping the image under 300MB
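The ai:{type}:{md5} key scheme with filter-aware invalidation can be sketched framework-free (the real app uses Django's LocMem cache; the field names here are illustrative):

```python
import hashlib
import json

def ai_cache_key(ai_type: str, filters: dict, prompt: str) -> str:
    """Namespace cache key in the ai:{type}:{md5} shape. Hashing a canonical
    serialisation of the filter state makes the key filter-aware: change any
    filter and the lookup misses, so stale AI responses are never served."""
    payload = json.dumps({"filters": filters, "prompt": prompt}, sort_keys=True)
    digest = hashlib.md5(payload.encode("utf-8")).hexdigest()
    return f"ai:{ai_type}:{digest}"

k1 = ai_cache_key("chart_narrative", {"country": "EL", "year": 2024}, "Summarise")
k2 = ai_cache_key("chart_narrative", {"country": "DE", "year": 2024}, "Summarise")
print(k1 == k2)  # False: different filter state, different cache entry
```

Sorting the JSON keys keeps the digest stable regardless of the order filters were applied in.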

PD Gait Analysis Agent — LLM-Powered Clinical Reasoning

Python AI/Gemini RAG/MongoDB

A clinician asks a question in plain English; the agent writes Python code to analyse wearable sensor gait data, executes it in a sandbox, and returns a clinically contextualised answer with full reasoning trace. Built on the PHIA pattern (Nature Communications, Jan 2026) — the first application of this approach to PD gait monitoring data.

Custom ReAct loop (no LangChain), MongoDB Atlas vector search over 17 clinical knowledge chunks, and synthetic data modelling realistic PD subtypes with medication wearing-off and freezing episodes. Entire stack runs on free-tier services at zero cost.
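A toy illustration of how medication wearing-off can be baked into synthetic gait data; the logistic ramp, timing, and parameters below are invented for this sketch and are not the project's actual model:

```python
import numpy as np

def simulate_stride_variability(hours_since_dose, base=2.0, wearing_off_gain=1.5, rng=None):
    """Toy wearing-off model: stride-time variability (%) rises as the
    levodopa effect fades. All parameters are illustrative."""
    rng = rng or np.random.default_rng(0)
    # Logistic ramp centred at ~3.5 h mimics an ON -> OFF transition.
    off_effect = 1.0 / (1.0 + np.exp(-(hours_since_dose - 3.5)))
    noise = rng.normal(0, 0.1, size=np.shape(hours_since_dose))
    return base + wearing_off_gain * off_effect + noise

hours = np.array([0.5, 2.0, 4.0, 6.0])
print(np.round(simulate_stride_variability(hours), 2))  # variability climbs with time since dose
```

Structure like this gives the agent something clinically meaningful to discover when it computes variability by medication state.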

Stack: Python · Streamlit · Gemini 2.5 Flash · MongoDB Atlas · sentence-transformers · Pandas · NumPy

Live Demo · GitHub (private repo, available on request)

Technical Highlights

LLM: Gemini 2.5 Flash (Google AI Studio, free tier)
Agent framework: custom ReAct loop (no LangChain dependency) handling prompt parsing, tool dispatch, and iteration control
Code execution: sandboxed Python with pre-loaded pandas DataFrame and patient profile dict
RAG: MongoDB Atlas vector search, all-MiniLM-L6-v2 embeddings (sentence-transformers, runs locally), 17 clinical chunks
System prompt: ~35k chars assembled from role description, clinical knowledge, data schema, patient profile, 6 few-shot ReAct trajectories, tool descriptions
Data: synthetic gait data (step_length, stride_time, cadence, stride_variability, asymmetry_index, freezing_flag, medication_state, hours_since_dose)
Frontend: Streamlit with patient selector, context cards, example question buttons, expandable reasoning trace

Architecture Decisions

  • PHIA-inspired ReAct pattern — agent reasons, acts (code or RAG), observes, and iterates; produces verifiable computation rather than hallucinated statistics
  • Custom agent loop over LangChain — full control over prompt assembly, tool dispatch, and iteration limits without framework overhead
  • Embedded clinical knowledge in system prompt + RAG — critical thresholds and scoring criteria are always available; RAG supplements with deeper guideline details on demand
  • Synthetic data with clinical realism — medication wearing-off patterns, progressive deterioration, freezing episodes, and PD subtype signatures allow meaningful agent evaluation without real patient data
  • Zero-cost stack — Gemini free tier, MongoDB Atlas M0, local embeddings; total running cost $0
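A stripped-down version of such a custom ReAct loop, with a scripted stand-in for the LLM so the sketch runs without an API key; the prompt format and tool names are illustrative, not the project's:

```python
import re

def react_loop(llm, tools, question, max_iters=5):
    """Minimal framework-free ReAct skeleton: the model alternates Thought/Action
    lines; each Action is dispatched to a tool and its Observation is appended
    to the transcript before the next model call."""
    transcript = f"Question: {question}\n"
    for _ in range(max_iters):
        step = llm(transcript)                 # model emits the next Thought/Action
        transcript += step + "\n"
        final = re.search(r"Final Answer: (.*)", step)
        if final:
            return final.group(1), transcript
        action = re.search(r"Action: (\w+)\[(.*)\]", step)
        if action:
            name, arg = action.groups()
            obs = tools[name](arg)             # tool dispatch (e.g. run_code, rag_search)
            transcript += f"Observation: {obs}\n"
    return None, transcript                    # iteration limit reached

# Scripted stand-in for the LLM, so the loop is runnable offline.
script = iter([
    "Thought: compute mean cadence\nAction: run_code[df['cadence'].mean()]",
    "Thought: 104 steps/min is within normal range\nFinal Answer: 104 steps/min",
])
tools = {"run_code": lambda code: "104.0"}
answer, trace = react_loop(lambda t: next(script), tools, "What is the mean cadence?")
print(answer)  # 104 steps/min
```

The appeal of skipping LangChain is visible even at this scale: the entire control flow fits in one readable function.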

KBforge — Knowledge Base Builder for LLM Retrieval (RAG)

Python AI/Gemini

Streamlit app that turns domain literature into production-ready knowledge bases for LLM retrieval-augmented generation (RAG). Ingest evidence from PDFs, PubMed, or structured JSON — every chunk is embedded, tagged with domain features, and stored in a ChromaDB vector index that any RAG pipeline can query with semantic search and metadata filtering. No coding required.

Evolved from a hardcoded Alzheimer’s research pipeline into a fully configurable tool where domain experts swap vocabularies, not code. Smart deduplication with cosine-similarity calibration merges near-duplicates without losing tags. Coverage gap tracking shows exactly where your KB is thin before your LLM starts hallucinating. Auto-generated extraction prompts let users paste papers into ChatGPT/Claude/NotebookLM and import the structured output directly.
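The overlapping-chunk step of PDF ingestion can be sketched as a simple character-window chunker (the real pipeline also does section detection; the sizes below are illustrative):

```python
def chunk_text(text, size=800, overlap=150):
    """Overlapping character-window chunker. Overlap keeps sentences that
    straddle a boundary retrievable from both neighbouring chunks."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

doc = "".join(str(i % 10) for i in range(1500))
parts = chunk_text(doc, size=800, overlap=150)
print(len(parts), [len(p) for p in parts])  # 3 [800, 800, 200]
```

Each chunk would then be embedded, tagged with domain features, and written to the vector index.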

Stack: Python · Streamlit · ChromaDB · sentence-transformers · Pydantic · Gemini API · PyMuPDF · PubMed E-utilities

GitHub (private repo, available on request)

Technical Highlights

Frontend: Streamlit multipage app (Setup, Add Sources, Knowledge Base), session state management
Data models: Pydantic v2 (Chunk, SourceInfo, ProjectConfig); all pipeline modules receive ProjectConfig, no hardcoded vocabularies
Ingestion: three sources — JSON import with validation, PDF upload (PyMuPDF section detection + overlapping chunks), PubMed abstract search (NCBI E-utilities, free tier)
LLM extraction: Gemini 2.5 Flash via google-genai, dynamic tagging prompts built from ProjectConfig.features, structured JSON output mode
Prompt generation: auto-generated extraction prompts (Prompt A/B) from project config; users paste papers into external LLMs and import the JSON output
Embeddings: sentence-transformers (all-MiniLM-L6-v2 default, configurable); generic embedder accepts any model
Vector store: ChromaDB with persistent SQLite backend, HNSW index, local (no cloud setup required)
Deduplication: cosine similarity with tag merging (not discard); calibrator shows similarity histogram, percentile stats, threshold impact table, top-N similar pairs
Coverage: per-feature chunk counts, coverage type breakdown, gap warnings, min_chunks_per_feature threshold
Normalisation: fuzzy feature-name matching against ProjectConfig vocabulary (handles typos, case variations)

Architecture Decisions

  • ProjectConfig everywhere — every module receives a Pydantic ProjectConfig object; no hardcoded feature lists, coverage types, or model names. Switching domains means changing one config, not refactoring code
  • Calibration before deduplication — different embedding models produce different similarity distributions. The calibrator shows the actual distribution so users pick a threshold grounded in their data, not a magic number
  • Tag merging over discard — when chunks are near-duplicates, tags from both are merged into the survivor. Information is preserved even when text is deduplicated
  • Three-path ingestion for different workflows — JSON paste for power users, auto-generated prompts for LLM-assisted extraction (no API key needed), PubMed + PDF for automated bulk ingestion
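The tag-merging deduplication described above can be sketched as a greedy pass over embeddings; the threshold is illustrative (KBforge calibrates it against the actual similarity distribution of the chosen embedding model):

```python
import numpy as np

def dedupe_with_tag_merge(embeddings, tags, threshold=0.92):
    """Greedy near-duplicate merge: later chunks matching an earlier survivor
    are dropped, but their tags are merged into the survivor rather than
    discarded, so no metadata is lost."""
    vecs = np.asarray(embeddings, dtype=float)
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
    survivors, merged_tags = [], []
    for i, v in enumerate(vecs):
        for j, s in enumerate(survivors):
            if float(v @ vecs[s]) >= threshold:   # cosine similarity on unit vectors
                merged_tags[j] |= set(tags[i])    # keep the duplicate's tags
                break
        else:
            survivors.append(i)
            merged_tags.append(set(tags[i]))
    return survivors, merged_tags

emb = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]      # first two are near-duplicates
tags = [{"amyloid"}, {"tau"}, {"lifestyle"}]
keep, merged = dedupe_with_tag_merge(emb, tags)
print(keep)  # [0, 2] — chunk 1 dropped, its "tau" tag merged into chunk 0
```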

Spark Athens — Social Nightlife Platform

Supabase React

Multi-sided platform for Athens nightlife that connects three audiences through role-based interfaces. Users discover events happening tonight, see who’s going before they commit, and match with people at the same venue. Venues get a management dashboard for promotion, attendee tracking, and talent booking. Artists find gigs and build their audience through event integration.

Squad matching lets friend groups find events together and discover other groups going to the same place. Designed around cross-role network effects — each new user, venue, or artist makes the platform stronger for everyone else.

Stack: Supabase · React

GitHub (private repo, available on request)

Athens Events Hub — Professional Event Networking for SMEs

Supabase React

B2B event networking platform designed for Greek SMEs attending professional conferences, trade shows, and workshops. Solves the biggest pain point of business events: walking in blind. Attendees publish structured profiles with explicit intent (looking for / offering), browse other attendees before the event, and schedule qualified meetings in advance.

Organisers get attendee analytics, demographic breakdowns, and engagement tracking. Built around measurable ROI — every interaction is trackable, so SMEs know whether an event was worth attending. Aligned with EDIH digital transformation priorities for the Attica region.

Stack: Supabase · React

GitHub (private repo, available on request)

Skills & Tools

Languages

Python · R · SQL · DAX · JavaScript

Data & BI

Pandas · Power BI · Plotly · ggplot2 · Streamlit · Shiny

Web & Backend

Django · HTMX · Tailwind CSS · MongoDB · SQLite · PostgreSQL

AI / ML

LLM integration · RAG pipelines · scikit-learn · Prompt engineering

DevOps & Cloud

Docker · Google Cloud Run · Azure · GitHub Actions · WhiteNoise

Domain Expertise

Statistical disclosure · Fisheries science · EU policy analytics · Digital transformation

Certifications

Google

Advanced Data Analytics Specialization (2025)

Business Intelligence Specialization (2025)

Microsoft

Azure Machine Learning for Data Scientists (2025)

Power BI & Power Virtual Agents (2025)

Data Visualization & Reporting with Generative AI (2025)

UC Davis

Geospatial Analysis with ArcGIS (2025)

Education

PhD

Quantitative Ecology & Statistical Modelling — University of Copenhagen / DTU-Aqua, Denmark (2006–2010). Marie Curie Fellowship. Meta-analysis and hierarchical modelling of population dynamics.

MSc

Ecology & Environmental Management — University of Patras, Greece (2003–2006)

BSc

Biology — University of Patras, Greece (1998–2003)