ProsperPath Experiment Feasibility

Available assets

ProsperPath already contains enough material to support a narrow, initial evaluation of context construction strategies for SME-style decision support.

  • Raw business-like inputs exist in mock_data/, including:
    • chat.json with multi-session owner-agent discussions covering sales decline, staffing gaps, inventory sync, supplier MOQ pressure, and other operational issues.
    • ratio.json with period-level financial metrics.
    • user_memories.json with owner context, preferences, and business constraints.
    • loca_trends.json with market trends, competitor actions, and regulatory/grant signals.
  • Derived artifacts already exist in ProsperPath:
    • 1001_output.json with decision factors, trace scores, and factor summaries.
    • 1001_recommendations.json with recommendation traces, signal summaries, and action suggestions.
  • Existing schema coverage is broad enough for a pilot:
    • schema/kg_schema.py and schema/README.md show support for Owner, OwnerCompany, Goal, Challenge, Crisis, Decision, Action, FinancialPeriod, CompetitorAction, MarketTrend, RiskEvent, and Recommendation.
  • Existing query and evidence paths are usable:
    • pp_query/service.py exposes search strategies through search(...).
    • pp_query/schemas.py defines node_rrf, edge_episode, and combined_rrf.
    • decision_feat/graph_signals.py exposes graph-based evidence signals such as mention count, relationship strength, and action connectivity.
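The name combined_rrf suggests the strategy fuses per-signal rankings via reciprocal rank fusion. As a point of reference only, here is a minimal self-contained RRF sketch; the function name `fuse_rrf` and the toy result lists are illustrative assumptions, not the actual pp_query implementation:

```python
def fuse_rrf(rankings, k=60):
    """Reciprocal rank fusion (illustrative, not the pp_query code):
    score each item by the sum of 1/(k + rank) over every ranking it
    appears in, then sort by fused score, highest first."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Toy rankings standing in for node-oriented and edge/episode-oriented hits.
node_hits = ["challenge:moq", "entity:supplier_a", "goal:cashflow"]
edge_hits = ["challenge:moq", "entity:supplier_a", "action:renegotiate"]
fused = fuse_rrf([node_hits, edge_hits])
# challenge:moq ranks first: it is top-ranked in both input lists.
```

The k constant (60 is the value commonly used in the RRF literature) damps the influence of any single high rank, which is why fused lists can still surface weakly related facts when one constituent ranking is noisy.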

Read-only Neo4j inspection for group_id = user_1001 confirmed non-trivial graph coverage.

  • 75 Episodic nodes
  • 682 total Entity nodes
  • 59 Recommendation nodes
  • 16 Action nodes
  • 3 Decision nodes
  • 14 Challenge nodes
  • 327 edges with non-null episodes metadata

Likely usable evaluation units

The cleanest unit for a first experiment is not a fully formalized decision episode, because the graph does not currently store explicit DecisionEpisode nodes. The most usable unit is a decision situation defined by:

  • a user query about a concrete SME problem
  • a small set of retrieved graph nodes and edges
  • one short-horizon business context where the graph contains at least a challenge, a suggested action, and some outcome or goal trace

Three task types already appear feasible with the current assets:

  • Supplier MOQ pressure under cash-flow constraints
  • Weekend staffing or rota conflicts under service constraints
  • Sales decline recovery under attribution uncertainty

These are suitable because they already have:

  • owner-facing problem statements in chat sessions
  • graph entities and recommendations linked to those problems
  • at least partial outcome or goal traces in the graph

Major blockers

The assets are usable, but only for a narrow and explicitly provisional experiment.

  • No explicit DecisionEpisode layer currently exists in the graph.
    • The pilot therefore approximates scenario-style context construction through richer relational retrieval and episode-linked context assembly.
  • Explicit evidence modeling is thin.
    • SUPPORTS edges exist, but EVIDENCED_BY is effectively absent in the inspected sample.
  • Some retrieval noise is already visible.
    • combined_rrf is usable, but still surfaces unrelated or weakly related facts.
    • edge_episode is more episode-aware, but noisier than combined_rrf in the tested cases.
  • Outcomes are uneven.
    • Some cases include achieved goals or benefit statements.
    • Others only contain partial follow-through, not clean before/after business outcomes.
  • The dataset is synthetic or semi-simulated.
• That does not invalidate a feasibility pilot, but it limits the external validity of any claims.

Confidence level on experiment feasibility

Medium.

The repository already supports a credible pilot for the narrow paper claim:

"scenario-oriented context construction may produce more decision-grade context than fragment retrieval alone on SME-like decision tasks."

Confidence is not high because the graph still mixes clean local context with noisy retrieval artifacts, and because explicit outcome/evidence schema support is incomplete.

Recommendation

Proceed narrowly.

What is feasible now:

• a small, inspectable comparison between fragment-oriented retrieval (node_rrf) and scenario-style context assembly (combined_rrf)
  • a rubric-based comparison of context quality on a few SME-like decision tasks
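The rubric comparison can be kept deliberately simple: human ratings per dimension, combined by fixed weights. The dimensions and weights below are assumptions for the pilot, not an established instrument:

```python
# Illustrative rubric for comparing context bundles; the dimensions and
# weights are assumptions for the pilot, not an established instrument.
RUBRIC = {
    "problem_coverage": 0.3,  # does the context state the owner's actual problem?
    "evidence_linkage": 0.3,  # are facts tied to episodes/edges rather than floating?
    "action_relevance": 0.2,  # does the context surface candidate actions?
    "noise_penalty": 0.2,     # inverted: 1.0 means no unrelated facts surfaced
}

def score_context(ratings: dict) -> float:
    """Weighted rubric score in [0, 1]; `ratings` holds per-dimension
    judgments in [0, 1] from a human rater."""
    return sum(RUBRIC[dim] * ratings.get(dim, 0.0) for dim in RUBRIC)

# Hypothetical ratings for one decision task under each condition.
fragment = score_context({"problem_coverage": 0.8, "evidence_linkage": 0.4,
                          "action_relevance": 0.5, "noise_penalty": 0.6})
scenario = score_context({"problem_coverage": 0.9, "evidence_linkage": 0.7,
                          "action_relevance": 0.8, "noise_penalty": 0.5})
```

Because the pilot is small, per-dimension ratings stay inspectable alongside the aggregate score, which matters more here than the particular weighting.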

What is not yet feasible as a strong main experiment:

  • robust claims about business effectiveness
  • broad claims across SME domains
  • fully automated scoring of decision quality
  • a clean test of a true implemented scenario-memory layer

The current assets are strong enough for a first paper-facing pilot, but not strong enough for a definitive empirical validation.