Anticipated critical objections — companion FAQ

Companion to Arkadium · Paper v1.11 · Jordi Berenguer Rodrigo · Opengea SCCL

Updated 2026-05-17 · maintained as a separate page so the paper remains focused on the positive contribution while this FAQ can grow with reader engagement.

This page collects thirteen critical objections to the Arkadium proposal — the kind of questions a serious external reviewer in AI safety, alignment, or philosophy of AI would reasonably raise — together with detailed responses.

Most of these objections already find scattered responses in the body of the paper; here they are grouped for ease of consultation, and new objections can be added over time without bumping the paper's version.

Objection 1. "Why 8 quadrants and not more or fewer?": Three considerations justify the granularity (see paper §4.1): structural (8 = 2³ + radial covers the fundamental philosophical distinctions), cognitive (Miller 1956: 7±2 elements simultaneously inspectable by humans), and operative (a balance between collapsing distinctions and statistical sparseness). The 6,400-metacategory level offers fine granularity without abandoning the 8 quadrants as a projection.
Objection 2. "It is not an empirical study but a proof of concept": Correct. Paper §9.3.b explicitly acknowledges this and describes the V1.1 validation plan (100 questions × 5 conditions × 3 human annotators, planned for the next 6 months). Until then, claims about 𝓗 as a quality indicator remain theoretically grounded but empirically unvalidated hypotheses. Arkadium is a reference architecture, not a production product (paper §13).
Objection 3. "The 80 categories are culturally biased": The architecture is, yes: the Globàlium is Catalan philosophical heritage developed by Xirinacs (paper §1). The axes, however, are not: OBJ-SUB, TEO-PRA, FEN-NOU, PLA-MON are distinctions present in European phenomenology, German idealism, hermeneutics, and Zen Buddhism. The genealogy is cultural; the axes are operative. This does not prevent other traditions from developing variants of the Meta-Globàlium with categories adjusted to their priorities; the system is provisional and revisable by design.
Objection 4. "Reward hacking is equally possible if the model knows the quadrants": Partially true, and the answer has had three empirical phases worth documenting (see paper §3bis for the full cycle). Phase v1: the first version of the verifier (the 𝓗 coverage-and-entropy metric) was indeed gameable through structure alone — a text with eight cardinal-titled headings and one neutral sentence under each saturated 𝓗 ≈ 1.0. Phase v2 (deployed 2026-05-07): the 𝓦 metric was extended with two positive components — axis_explicit and subordinating_synthesis — and rebalanced so that a list-of-headings text now scores 𝓦 ≈ 0.15 against 𝓦 ≈ 0.80–0.95 for a genuinely dialectical response. Phase v3 (deployed 2026-05-17): integrated mereological_coverage as the eighth component (weight 0.15) to cover a third Goodhart — the possibility of operating the entire dialectic in self-identity mode (A=A) without exercising the other three canonical Part-Whole relations (inclusion, containment, correlation) that Xirinacs § 422 defines. Full empirical justification at docs/wisdom-score-design.md §3bis–§3ter. But the relevant fact for this objection is the architectural lesson: every time the metric becomes robust to one failure mode, another emerges. Between v1 and v2 we identified a second Goodhart within 24 h of deployment: structure without wisdom. The answer to this second form of gaming is out of band: we deployed (i) a user-facing parameter, the escope (paper §5.bis), which moves the response between three registers aligned with the radial PLA-MON pulse; and (ii) a wisdom-polish second pass that separates doing the dialectical work from saying it well. Full specification at docs/escope-parameter-design.md. 
The architectural conclusion is that robustness to reward hacking is not a static property of a metric but an evolutionary line: each generation of the verifier anticipates known failure modes, defines new probes, and the combination metric + prompt + UI collectively covers what the metric alone cannot. The empirical validation (paper §9.3.b) is designed precisely to quantify the correlation of 𝓦 v3 + escope=0 + polish against human annotation.
Objection 5. "The manifest is Catalanocentric": The cited genealogy (Llull → Sibiuda → Pujols → Xirinacs) is Catalan, yes. This choice is grounded in the documentary record and does not aspire to replace other global philosophical traditions. Because the model's poles are philosophical universals (subject/object, theory/practice, phenomenon/noumenon), the same architecture can be articulated with different genealogies: Madhyamaka, Vedanta, Taoism, Hegelianism. The Meta-Globàlium does not claim to appropriate a culture; it claims to operationalize an intuition common to multiple integrative traditions.
Objection 6. "The dialectical principles are not mathematically original": True, and the manifest acknowledges this explicitly (paper §4.4, note on originality). The 𝓗 formula combines standard metrics from information theory; the six principles reformulate dialectical contents present in the philosophical tradition. The original contribution is architectural: the displacement of the locus of verification from a textual constitution interpretable by the model itself to an external ontological geometry, computable as an objective structural property (paper §4.2).
Objection 7. "The hypersphere is metaphor, not operative geometry": The projection is mathematically defined and computationally implemented (paper §4.1). The 4D hypersphere is not a dead letter: each category has assigned Cartesian coordinates, projection to the 8 primary quadrants is a mechanical operation, and the 𝓗 metric is computed explicitly at runtime on each response. The verifier code is accessible and auditable (paper §9.4). The geometric metaphor is a visualization; the geometry is operative.
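To make "mechanical operation" concrete, here is a minimal sketch of what such a projection and score could look like. It is not the Arkadium verifier: the axis order, the sign-based quadrant rule, the function names `quadrant` and `h_score`, and the form of the score (coverage fraction times normalized Shannon entropy) are all assumptions chosen for illustration. The actual definition of 𝓗 is in paper §4.1.

```python
import math
from collections import Counter

# Assumed axis order: (OBJ-SUB, TEO-PRA, FEN-NOU, PLA-MON).
# The signs of the first three coordinates select one of 2**3 = 8 quadrants;
# the fourth coordinate is treated as the radial component and ignored here.
def quadrant(coords):
    x, y, z, _radial = coords
    return (int(x >= 0) << 2) | (int(y >= 0) << 1) | int(z >= 0)

def h_score(points, n_quadrants=8):
    """Assumed stand-in for H: quadrant coverage times normalized entropy."""
    counts = Counter(quadrant(p) for p in points)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    coverage = len(counts) / n_quadrants
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return coverage * entropy / math.log(n_quadrants)

# One hypothetical category in each quadrant: full coverage, uniform spread.
balanced = [(sx, sy, sz, 0.5)
            for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]
# Every category in the same quadrant: minimal coverage, zero entropy.
one_sided = [(1, 1, 1, 0.5)] * 8
```

Under these assumptions `h_score(balanced)` reaches the maximum and `h_score(one_sided)` is zero, and both values depend only on the assigned coordinates, never on how the model interprets a text.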
Objection 8. "It is not proven that high 𝓗 = human quality": No, not yet proven. This is the most serious methodological limitation, acknowledged explicitly (paper §9.3.b). The V1.1 study plan has precisely this correlation as its central hypothesis, measured with double-blind human annotation on five dimensions of quality. Until the results of this study are available, the manifest claims 𝓗 as an objective structural property (non-omission of dialectical poles), not as a validated metric of human quality.
Objection 9. "Xirinacs is not a recognized academic figure in current AI alignment": The thesis A global model of reality (1997) is peer-reviewed: it was defended at the University of Barcelona before an academic committee. That Xirinacs is also a Catalan public figure known for other reasons does not weaken the academic quality of the thesis, which was published in the UB repository and remains a current reference. The manifest cites the thesis as academic work, not the public figure.
Objection 10. "What happens when the question has no genuine dialectical structure?": Good question. For simple factual questions ("What is the capital of France?") dialecticity is spurious and 𝓗 contributes no value — the correct answer touches a single pole (OBJ) and that is appropriate. The structural verifier is designed for human domains where access plurality is constitutive (ethics, politics, social judgment, public deliberation). The volta selection by question type (paper §4.5.b) anticipates this scenario: the volta of application operates on reasoning; the volta of knowledge on learning; the volta of orientation on personal direction and meaning. A prior router identifies the question type and selects the appropriate volta — roadmap functionality (paper §12), not yet implemented.
Objection 11. "Why this specific ontology and not another? The 80 categories look arbitrary.": Two answers, structural and empirical. Structural: the four axes (SUB↔OBJ, TEO↔PRA, FEN↔NOU, PLA↔MON) are reflective universals attested across philosophical traditions — European phenomenology, German idealism, hermeneutics, Madhyamaka, Vedanta, Taoism, Zen. The 80 categories are derived combinatorially from these axes (8 → 26 → 80 by canonical projection) — granularity calibrated to Miller's 7±2 at each level. Other reflective vocabularies can articulate similar axes; we expect them to converge structurally even when the labels differ. Empirical: paper §9.3.b.bis specifies the construct-validity panel that tests this precisely — if the ontology fails inter-rater κ ≥ 0.6 with external philosophers, ethicists, and AI-safety researchers, the claim of meaningful structure is empirically rejected, and we say so in advance. The architecture is provisional and revisable by design (paper §13); the test is the construct-validity panel, not a stipulation.
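For readers unfamiliar with the κ ≥ 0.6 criterion, the sketch below computes Cohen's kappa for two raters assigning category labels to the same items. This is an illustrative assumption: the FAQ does not say which kappa variant the panel uses, and a panel with more than two raters would more plausibly use a multi-rater statistic such as Fleiss' kappa. The names `cohen_kappa`, `passes_panel`, and the sample labels are hypothetical.

```python
from collections import Counter

KAPPA_THRESHOLD = 0.6  # threshold quoted in the response above

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if both raters labeled independently at their own rates.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (observed - expected) / (1 - expected)

def passes_panel(rater_a, rater_b):
    return cohen_kappa(rater_a, rater_b) >= KAPPA_THRESHOLD

# Hypothetical annotations of four passages.
agree = (["OBJ", "OBJ", "SUB", "SUB"], ["OBJ", "OBJ", "SUB", "SUB"])
chance = (["OBJ", "OBJ", "SUB", "SUB"], ["OBJ", "SUB", "OBJ", "SUB"])
```

The second pair agrees on half the items, yet its kappa is zero because that is exactly the agreement expected by chance. This is why the panel criterion is stated in terms of κ and not raw agreement.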
Objection 12. "Does the ontology transfer outside its cultural origin? What about non-Western domains?": Not yet tested empirically; this is an acknowledged boundary condition (paper §13). The hypothesis is differentiated: the axes are reflective universals expected to transfer (subject/object, theory/practice, phenomenon/noumenon are not Western patents); the category labels (BEL, COS, IDE…) carry European philosophical genealogy and may benefit from re-labeling by other traditions. The fractal architecture is precisely designed to allow this: thematic variants A/B/…/V (paper §12 roadmap) include cross-cultural adaptations as natural extensions. Cross-lingual production is already underway (CA + EN; ES dropped 2026-05-08 by scope decision, not by impossibility). Future work explicitly includes cross-cultural construct-validity panels — but Phase 1 is honestly framed as a Western-European-rooted operationalization, not a cross-cultural universal.
Objection 13. "How do you prove that R̂ won't collapse to a single metric (e.g. learn 𝓗 well and ignore 𝓦 and 𝓕)?": Empirical concern, addressed by three layered defenses. (i) Calibration gate: R̂'s fidelity to the symbolic stack is tested per-component (Spearman ρ ≥ 0.85 on each of 𝓗, 𝓦, 𝓕, 𝓜, causal coverage, SD-WISE), not aggregated. A model that learns one component at the expense of others fails the gate and triggers redesign before any training compute is committed (Foundation milestone, paper §12.3). (ii) Dual-criterion break in the loss: the training-time reward applies the same logical-AND structure as the inference-time verifier — gradient flows only when all three of 𝓗 ∧ 𝓦 ∧ 𝓕 clear their thresholds simultaneously, so optimizing any single component yields no signal. (iii) Goodhart-resistance experiment (E4): paper §12 explicitly trains an adversarial condition that tries to collapse R̂ to a single metric, measures the accuracy cost (gap-to-evade), and reports the result pre-registered. If R̂ does collapse despite (i) and (ii), E4 quantifies how much capability the gaming model must sacrifice — and that null result is itself publishable empirical evidence about boundary conditions of multi-criterion process rewards.

If you have an objection not on this list, please write to jordi@opengea.org with the objection and the line of reasoning behind it — strong objections get added here with a documented response. The page is meant to grow.