Six sub-areas in total. In each, the same two functional roles are distinguished: a material to be inverted and a functional receiver. Line A (the frontier method) is presented first: in A·1, A·2, and the TRAG foundation, the material is a RAG + LLM over a domain corpus and the receiver is a physical or process model. Line B (the consolidated SciML method) follows: in B·1, B·2, and B·3, the material is data + system knowledge and the receiver is a physical model with neural closure.
Line A reformulates the inverse problem as LLM-orchestrated RAG-mediated parameter recovery: a complex simulator or regulatory corpus is the system to invert, and a hybrid syntactic-vector RAG under LLM orchestration is the informational regulariser. The two applied projects converge on the sub-alpine Italian lake system — Lake Maggiore and Lake Varese — exploring its biological-demographic and physical-biogeochemical strata respectively.
The architectural prototype from which the two Line A applied projects derive. Its operational maturity demonstrates the feasibility of LLM-orchestrated RAG architectures on a real, high-value, structured corpus.
Hybrid syntactic-vector RAG indexed on Elasticsearch. Lexical matching on structured fields combined with dense-vector retrieval. Python application layer, Open WebUI conversational front-end, LLM inference via external API.
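A minimal sketch of how one hybrid request might be assembled, assuming Elasticsearch 8.x, where a single search body can carry both a lexical `query` and an approximate `knn` clause. The index layout is not described here, so the field names `product_name`, `section_text`, and `embedding`, as well as the boost values and the `num_candidates` heuristic, are hypothetical.

```python
from typing import Any


def build_hybrid_query(
    text: str,
    query_vector: list[float],
    k: int = 10,
    lexical_boost: float = 1.0,
    vector_boost: float = 1.0,
) -> dict[str, Any]:
    """Return a search body combining lexical and dense-vector retrieval."""
    return {
        # Lexical leg: match the user text against structured fields.
        # Field names are hypothetical placeholders.
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["product_name^2", "section_text"],
                "boost": lexical_boost,
            }
        },
        # Dense leg: approximate nearest-neighbour search on an embedding field.
        "knn": {
            "field": "embedding",
            "query_vector": query_vector,
            "k": k,
            "num_candidates": 10 * k,  # assumed heuristic, not a tuned value
            "boost": vector_boost,
        },
        "size": k,
    }
```

The two boosts set the relative weight of the lexical and vector scores when Elasticsearch combines them; any production weighting would have to be tuned against the corpus.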
~3,200 European Medicines Agency documents (~1,600 Product Information + ~1,600 European Public Assessment Reports). Average length ~50 pages. All documents carry legal weight.
Direct lineage from the founding nucleus's work at the EMEA in London (1995-2003): the same regulatory corpus, reread with the apparatus of the 2020s.
Recovery of the demographic parameters of the cladoceran population of Lake Maggiore through a compartmental ODE model, where the egg age-structures required for parametric estimation are generated by a specialised RAG under LLM orchestration.
30+ years of zooplankton data and documentation accumulated by Marina Manca: CIPAIS long-term monitoring, CNR-IRSA Verbania archives, individual scientific notebooks, protocols, grey literature on the lake.
Specialised RAG over the Manca corpus, modelled on TRAG. Under LLM orchestration the system assembles the egg age-structures required by a McKendrick-von Foerster compartmental ODE population model; the model then estimates the demographic parameters, with the RAG-generated structures closing the inverse problem.
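To make the notion of an egg age-structure concrete, the sketch below advances a set of egg age classes using the simplest discretisation of McKendrick-von Foerster transport: the time step equals the class width, so each cohort moves one class per step. It is an illustration under stated assumptions, not the project's model: constant mortality, constant laying rate, and all function and parameter names are hypothetical.

```python
import math


def step_egg_age_structure(classes, new_eggs, mortality, dt):
    """Advance the egg age classes by one step of length dt (= class width).

    Every cohort survives with probability exp(-mortality * dt) and moves to
    the next class; the oldest class leaves the structure as hatchlings.
    Returns (next_classes, hatched).
    """
    survival = math.exp(-mortality * dt)
    aged = [n * survival for n in classes]
    hatched = aged[-1]
    next_classes = [new_eggs] + aged[:-1]
    return next_classes, hatched


def simulate(n_classes, development_time, laying_rate, mortality, n_steps):
    """Run the age-class scheme from an empty structure (assumed start)."""
    dt = development_time / n_classes
    classes = [0.0] * n_classes
    hatched_per_step = []
    for _ in range(n_steps):
        classes, hatched = step_egg_age_structure(
            classes, laying_rate * dt, mortality, dt
        )
        hatched_per_step.append(hatched)
    return classes, hatched_per_step
```

With zero mortality the structure fills to a uniform profile whose total equals laying rate times development time, which is a quick sanity check on any age-structure assembled from documentary sources.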
A·1 and B·1 operate on the same substrate with complementary apparatuses. B·1 uses PINNs against time-series; A·1 uses RAG-LLM against documentary evidence. Designed to be combined.
Calibration of the General Lake Model + Aquatic EcoDynamics module (GLM-AED2) for Lake Varese: recovery of physical and biogeochemical parameters from a heterogeneous evidence base mediated by a specialised RAG under LLM orchestration.
GLM-AED2 (Aquatic Ecodynamics Research Group, University of Western Australia) — 1-D water-balance + stratification, coupled with the AED2 module library for biogeochemistry and ecology. Simulates temperature, oxygen, nitrogen, phosphorus, organic matter, phytoplankton.
Specialised RAG over Lake Varese documentation and data. Under LLM orchestration the system generates the parametric profiles (initial conditions, boundary fluxes, rate constants, calibration targets) required by GLM-AED2.
GLM-AED2 calibration is a recognised problem with existing tools. The Institute's contribution is the RAG-LLM regularising layer that uses heterogeneous documentary evidence to constrain the parametric search.
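One way to read "constraining the parametric search" is as a penalised misfit: a data term plus a penalty that pulls the parameter towards a value supported by the documentary evidence. The sketch below shows that idea on a toy problem. The quadratic penalty, the grid search, and `toy_simulator` (which is not GLM-AED2) are all assumptions made for illustration; how the RAG-LLM layer actually enters the calibration is not specified here.

```python
def penalised_misfit(theta, observations, simulator, prior_mean, prior_sd, weight=1.0):
    """Sum of squared residuals plus a quadratic penalty around prior_mean."""
    predictions = simulator(theta)
    data_term = sum((obs - pred) ** 2 for obs, pred in zip(observations, predictions))
    prior_term = ((theta - prior_mean) / prior_sd) ** 2
    return data_term + weight * prior_term


def calibrate(candidates, observations, simulator, prior_mean, prior_sd, weight=1.0):
    """Return the candidate parameter with the smallest penalised misfit."""
    return min(
        candidates,
        key=lambda theta: penalised_misfit(
            theta, observations, simulator, prior_mean, prior_sd, weight
        ),
    )


def toy_simulator(theta):
    """Stand-in for a simulator run (NOT GLM-AED2): a linear response."""
    return [theta * x for x in (1.0, 2.0, 3.0)]


# Illustrative numbers only: observations generated with theta = 2.
observations = [2.0, 4.0, 6.0]
candidates = [i / 10 for i in range(0, 41)]
unconstrained = calibrate(candidates, observations, toy_simulator, 1.0, 0.1, weight=0.0)
constrained = calibrate(candidates, observations, toy_simulator, 1.0, 0.1, weight=1.0)
```

Setting `weight` to zero recovers the purely data-driven fit; a tight `prior_sd` moves the estimate towards the document-derived value, which is the regularising behaviour described above.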
Methodological dialogue with King's College London + Imperial College London. Contact: Oliver Perkins (KCL), author of WHAM! (Wildfire Human Agency Model, Geoscientific Model Development, 2024) and of its coupling with JULES-INFERNO within the UK Earth System Model for CMIP7.
Methodological exchange on simulator + RAG + LLM fusion. IIPS contributes the regulatory-corpus + RAG-LLM architecture lineage from TRAG and the limnological inverse-problem tradition; the Leverhulme Centre for Wildfires, Environment and Society contributes the climate-fire simulator infrastructure and surrogate-modelling experience at planetary scale.
The Institute does not pursue the wildfire application domain itself. The exchange is strictly methodological. The Institute's applied Line A projects are A·1 and A·2 on the sub-alpine Italian lakes.
Each Line B project takes a 1970s-1980s paper of the founding lineage in which an inverse problem was formulated rigorously but solved with the regularisation tools available at the time (Tikhonov, Kalman, mass-balance closure). Each project preserves the symbolic constraint of the original formulation and replaces the empirical functional component with a contemporary neural approximator.
From Memorie IIDr 45 (1987) to PINN-based birth-rate inversion.
Argentesi, de Bernardi, Di Cola & Manca (1987) — comparative analysis of estimators for cladoceran birth rates. Argentesi 2026, under review at Journal of Limnology — closed-form bias structure.
McKendrick-von Foerster age-transport + PINN for non-mechanistic closure + deep image classifiers on plankton microscopy as the modern observational channel.
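For reference, the age-transport equation and one plausible form of a physics-informed loss can be written as follows. The first two lines are the standard McKendrick-von Foerster equation and its renewal boundary condition. The third line is an assumption about how a PINN might be set up: the choice of the mortality $\mu_\theta$ as the neurally closed term, the network surrogate $n_\theta$, and the weight $\lambda$ are illustrative and not taken from the project.

```latex
\begin{aligned}
\frac{\partial n}{\partial t} + \frac{\partial n}{\partial a}
  &= -\mu_\theta(a,t)\, n(a,t), \\
n(0,t) &= \int_0^{\infty} \beta(a,t)\, n(a,t)\, \mathrm{d}a, \\
\mathcal{L}(\theta)
  &= \sum_i \bigl( n_\theta(a_i,t_i) - n_i^{\mathrm{obs}} \bigr)^2
   + \lambda \sum_j \Bigl( \partial_t n_\theta + \partial_a n_\theta
   + \mu_\theta\, n_\theta \Bigr)^2 \Big|_{(a_j,t_j)}
\end{aligned}
```

Here $n(a,t)$ is the density of individuals of age $a$ at time $t$ and $\beta$ is the age-specific birth rate. The transport operator on the left is the symbolic constraint that is preserved; only the closure term is learned.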
From NUMSAS (Brussels 1979) to the ESARDA MBE Working Group retrospective.
Argentesi, Casilli & Franklin (ESARDA Brussels 1979); Argentesi, Hafer, Markin & Shipley (INMM Vail 1982 with LANL); Beedgen, Argentesi & Avenhaus (INMM 1987 PROSA).
JRC-NUKEM Hanau · JRC-BNFL Risley · JRC-Reading (Curnow/Woods, source of SITMUF).
State-space neural networks and normalising flows on the residual structure of fissile material balances under correlated measurement noise.
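The residual that these models operate on is the material balance itself. As a classical baseline, the sketch below computes the MUF sequence (beginning inventory plus receipts minus shipments minus ending inventory) and then decorrelates it with a Cholesky factor of its covariance, which is one common way of constructing a standardised, independently transformed sequence of the SITMUF type. It does not implement the neural methods named above. The covariance is taken as given, and all numbers are invented for illustration.

```python
import math


def muf_sequence(inventories, receipts, shipments):
    """MUF_t = I_{t-1} + R_t - S_t - I_t for each balance period t."""
    return [
        inventories[t] + receipts[t] - shipments[t] - inventories[t + 1]
        for t in range(len(receipts))
    ]


def cholesky_lower(matrix):
    """Lower-triangular L with matrix = L L^T (matrix must be positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def standardise(muf, covariance):
    """Solve L y = muf by forward substitution, where covariance = L L^T."""
    lower = cholesky_lower(covariance)
    y = []
    for i, value in enumerate(muf):
        acc = sum(lower[i][k] * y[k] for k in range(i))
        y.append((value - acc) / lower[i][i])
    return y


# Invented illustrative figures: three inventory takings, two balance periods.
muf = muf_sequence([100.0, 98.0, 99.0], [10.0, 12.0], [11.0, 10.5])
# Successive MUFs share an inventory measurement, hence a negative covariance.
residuals = standardise(muf, [[4.0, -2.0], [-2.0, 5.0]])
```

If the covariance model is correct and no material is missing, the transformed residuals are uncorrelated with unit variance; structure remaining in them is what a learned model would have to explain.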
From Merlini 1971 to anatomically-informed GNN compartmental inversion.
Merlini 1971 — Zn-65 compartmental physiology in Lepomis gibbosus. Margaret Merlini acknowledged In Memoriam.
Classical compartmental inversion treats organs as undifferentiated boxes. The biological reality is anatomically structured — graph topology with physiological modulation.
Anatomical graph (organs as nodes, vascular/lymphatic connections as edges) provides the symbolic constraint. GNN learns the functional closure of inter-organ exchange rates as functions of physiological covariates.
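A minimal sketch of the structure described above: the graph fixes which exchanges are allowed, and a small function of a physiological covariate supplies each exchange rate. Everything specific is hypothetical: the three-organ graph, the use of temperature as the covariate, and the log-linear `exchange_rate`, which merely stands in for a trained GNN. The system is closed (no uptake, excretion, or radioactive decay) to keep the example short.

```python
import math

# Hypothetical anatomical graph: directed edges (source organ, target organ).
EDGES = [
    ("blood", "liver"),
    ("liver", "blood"),
    ("blood", "muscle"),
    ("muscle", "blood"),
]


def exchange_rate(edge, temperature, weights):
    """Stand-in for the learned closure: a positive rate from one covariate."""
    base, slope = weights[edge]
    return math.exp(base + slope * temperature)


def euler_step(masses, temperature, weights, dt):
    """One explicit Euler step of dm_i/dt = sum_j (k_ji m_j - k_ij m_i)."""
    flows = {organ: 0.0 for organ in masses}
    for source, target in EDGES:
        rate = exchange_rate((source, target), temperature, weights)
        flux = rate * masses[source] * dt
        flows[source] -= flux  # tracer leaving the source organ
        flows[target] += flux  # the same amount arriving at the target
    return {organ: masses[organ] + flows[organ] for organ in masses}


# Illustrative run: all tracer starts in the blood compartment.
weights = {edge: (-2.0, 0.05) for edge in EDGES}
masses = {"blood": 1.0, "liver": 0.0, "muscle": 0.0}
for _ in range(50):
    masses = euler_step(masses, temperature=20.0, weights=weights, dt=0.1)
```

Because every flux is subtracted from one node and added to another, total tracer is conserved by construction regardless of what the rate function returns; that is the sense in which the graph acts as the symbolic constraint while the rates are left to be learned.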
In each project, a system with explicit symbolic structure is the system to invert; the empirical component the classical formulation could not specify is supplied by a contemporary information-processing system — a neural network for Line B, an LLM-orchestrated RAG for Line A. The geometric centre of the Institute's applied work is the sub-alpine Italian lake system, where B·1 and A·1 share Lake Maggiore and A·2 extends to Lake Varese.