Query and Data Invalidation Architecture
Last Updated: 2026-02-12
This document describes the frontend's data fetching and cache invalidation architecture using TanStack Query (React Query).
Table of Contents
- Overview
- Core Principles
- Query Hooks
- Centralized Invalidation
- Usage Patterns
- Best Practices
- Common Pitfalls
- Debugging
Overview
Nagelfluh uses TanStack Query v4 for: - Server state management - Automatic background refetching - Cache management - Optimistic updates - Request deduplication
Architecture: All data fetching uses custom hooks that wrap TanStack Query. All cache invalidation goes through centralized helpers in ProcessContext.
Core Principles
1. Use Hooks Everywhere
❌ Never do this:
fetch(`${API}/datasets?search=${search}`)
.then(r => r.json())
.then(data => setDatasets(data));
✅ Always do this:
const { data: datasets = [] } = useSearchDatasets(search, true, projectId);
Why: Hooks connect components to the query cache. Manual fetch() calls are invisible to the cache system and won't update when data changes.
2. Centralized Invalidation Only
❌ Never do this:
queryClient.invalidateQueries({ queryKey: ['processes'] });
// or
queryClient.refetchQueries({ queryKey: ['processes'] });
✅ Always do this:
const { invalidateProject } = useContext(ProcessContext);
await invalidateProject(projectId);
Why: Centralized invalidation ensures all related queries are invalidated together, preventing race conditions and partial updates.
3. Trust TanStack Query
Don't implement: - Polling loops to wait for data - Manual cache coordination - Complex refetch orchestration
TanStack Query handles all of this automatically when you use hooks and invalidate through the centralized helpers.
Query Hooks
All query hooks are defined in frontend/src/datamodel/useQueries.js.
Available Hooks
Projects and Environments
// Fetch all projects
const { data: projects = [], isLoading, error } = useProjects();
// Fetch all environments
const { data: environments = [], isLoading, error } = useEnvironments();
// Fetch process types for an environment
const { data: types = {}, isLoading } = useEnvironmentProcessTypes(environmentId);
Processes
// Fetch all processes for a project
const { data: processes = [], isLoading, error, refetch } = useProcesses(projectId);
// Note: refetch is available but prefer using invalidation helpers
Datasets
// Fetch a single dataset by ID
const { data: dataset, isLoading, error } = useDataset(datasetId);
// Search datasets with filters
const { data: datasets = [], isLoading, error } = useSearchDatasets(
searchText, // Search string
completedOnly, // Boolean: only completed processes
projectId // Filter by project
);
// Fetch outputs for a specific process version
const { data: datasets = [], isLoading } = useProcessOutputDatasets(
process, // Process object
version // Version number
);
Mutations
// Create a new project
const createProject = useCreateProject();
await createProject.mutateAsync({ name: "My Project" });
// Create a new environment
const createEnvironment = useCreateEnvironment();
await createEnvironment.mutateAsync({ name: "My Env", image: "..." });
// Create a new process (does NOT auto-invalidate)
const createProcess = useCreateProcess();
const newProcess = await createProcess.mutateAsync({
proc: { name, type, params, ... },
projectId
});
// Must manually invalidate after:
await invalidateProject(projectId);
Query Keys
Query keys are centralized in queryKeys object:
import { queryKeys } from './datamodel/useQueries';
queryKeys.projects // ['projects']
queryKeys.environments // ['environments']
queryKeys.environmentProcessTypes(envId) // ['environmentProcessTypes', envId]
queryKeys.processes(projectId) // ['processes', projectId]
queryKeys.dataset(id) // ['dataset', id]
queryKeys.datasets(search, completedOnly, projectId) // ['datasets', { ... }]
queryKeys.processOutputDatasets(processId, version) // ['processOutputDatasets', processId, version]
Note: You rarely need these directly. Use the invalidation helpers instead.
Centralized Invalidation
All query invalidation MUST go through helpers provided by ProcessContext.
Invalidation Helpers
const {
invalidateProcess,
invalidateProject,
invalidateDatasets
} = useContext(ProcessContext);
invalidateProcess(processId, projectId)
Invalidates a specific process and its outputs.
Refetches:
- ['processes', projectId]
- ['processOutputDatasets', processId] (all versions)
Use when: A specific process has been updated (parameters changed, new version created).
// Example: After updating process parameters
await createProcess.mutateAsync({ proc: { id: processId, ... }, projectId });
await invalidateProcess(processId, projectId);
invalidateProject(projectId)
Invalidates all data for a project.
Refetches:
- ['processes', projectId] - All processes
- ['datasets'] - All dataset searches
- ['processOutputDatasets'] - All process outputs (via predicate)
Use when: - Creating a new process - Deleting a process - Any operation that affects multiple processes - WebSocket update received (process state changed)
// Example: After creating a new process
const newProcess = await createProcess.mutateAsync({ proc, projectId });
await invalidateProject(projectId);
setActiveProcess({ processId: newProcess.id, version: 1 });
invalidateDatasets()
Invalidates dataset searches only.
Refetches:
- ['datasets'] - All dataset searches
Use when: Dataset metadata has changed but processes haven't (rare).
// Example: After dataset-only operation
await someDatasetOperation();
await invalidateDatasets();
Automatic Invalidation
WebSocket Updates: When the backend sends a process state update via WebSocket, ProcessContext automatically calls invalidateProject(). You don't need to handle this manually.
Location: frontend/src/ProcessContext.js:210-213
const handleWebSocketMessage = useCallback(async (update) => {
await invalidateHelpers.invalidateProject();
}, [invalidateHelpers]);
Usage Patterns
Pattern 1: Creating a Process
import { useContext } from 'react';
import { ProcessContext } from '../ProcessContext';
import { useCreateProcess } from '../datamodel/useQueries';
function MyComponent() {
const { currentProject, invalidateProject, setActiveProcess } = useContext(ProcessContext);
const createProcess = useCreateProcess();
const handleCreate = async () => {
// 1. Create the process
const newProcess = await createProcess.mutateAsync({
proc: {
name: "my-process",
type: "fft",
params: { ... },
environment_id: envId
},
projectId: currentProject
});
// 2. Invalidate to update all components
await invalidateProject(currentProject);
// 3. Navigate to the new process
setActiveProcess({
processId: newProcess.id,
version: 1
});
};
return <button onClick={handleCreate}>Create Process</button>;
}
Pattern 2: Searching Datasets
import { useState, useEffect, useRef } from 'react';
import { useSearchDatasets } from '../datamodel/useQueries';
function DatasetSearch({ projectId }) {
const [searchText, setSearchText] = useState('');
const [debouncedSearch, setDebouncedSearch] = useState('');
const debounceTimer = useRef(null);
// Debounce search input
useEffect(() => {
if (debounceTimer.current) clearTimeout(debounceTimer.current);
debounceTimer.current = setTimeout(() => {
setDebouncedSearch(searchText);
}, 300);
return () => {
if (debounceTimer.current) clearTimeout(debounceTimer.current);
};
}, [searchText]);
// Use the hook with debounced search
const { data: datasets = [], isLoading } = useSearchDatasets(
debouncedSearch,
true, // completedOnly
projectId
);
return (
<div>
<input value={searchText} onChange={e => setSearchText(e.target.value)} />
{isLoading ? <div>Loading...</div> : (
<ul>
{datasets.map(ds => <li key={ds.id}>{ds.dataset_name}</li>)}
</ul>
)}
</div>
);
}
Key: The component automatically updates when invalidateProject() is called elsewhere, because the hook is connected to the query cache.
Pattern 3: Displaying Process Outputs
import { useContext } from 'react';
import { ProcessContext } from '../ProcessContext';
import { useProcessOutputDatasets } from '../datamodel/useQueries';
function ProcessOutputs() {
const { processes, activeProcess } = useContext(ProcessContext);
// Find the active process object
const process = activeProcess
? processes.find(p => p.id === activeProcess.processId)
: null;
// Fetch outputs for active version
const { data: datasets = [], isLoading } = useProcessOutputDatasets(
process,
activeProcess?.version
);
if (!process) return <div>No process selected</div>;
if (isLoading) return <div>Loading outputs...</div>;
return (
<ul>
{datasets.map(ds => (
<li key={ds.id}>{ds.dataset_name}: {ds.url}</li>
))}
</ul>
);
}
Pattern 4: Updating Process Parameters
import { useContext } from 'react';
import { ProcessContext } from '../ProcessContext';
import { useCreateProcess } from '../datamodel/useQueries';
function ProcessParameterEditor({ process }) {
const { currentProject, invalidateProject, setActiveProcess } = useContext(ProcessContext);
const createProcess = useCreateProcess();
const handleSave = async (newParams) => {
// Creating with existing ID creates a new version
const updatedProcess = await createProcess.mutateAsync({
proc: {
id: process.id, // Same ID = new version
name: process.name,
type: process.type,
environment_id: process.environment_id,
params: newParams
},
projectId: currentProject
});
// Invalidate to show the new version
await invalidateProject(currentProject);
// Switch to the new version
const latestVersion = Math.max(...updatedProcess.versions.map(v => v.version));
setActiveProcess({
processId: process.id,
version: latestVersion
});
};
return <ParameterForm onSave={handleSave} />;
}
Best Practices
1. Always await invalidation
// ✅ Good: Wait for refetch to complete
await invalidateProject(projectId);
setActiveProcess({ processId: newId, version: 1 });
// ❌ Bad: Race condition - activeProcess set before data arrives
invalidateProject(projectId); // Don't await
setActiveProcess({ processId: newId, version: 1 });
2. Use staleTime appropriately
Current defaults (in useQueries.js):
- Projects: 5 minutes
- Environments: 5 minutes
- Process types: 5 minutes
- Processes: 10 seconds
- Datasets: 10 seconds
- Single dataset: 30 seconds
Why short staleTime for processes/datasets: These change frequently via user actions and WebSocket updates.
3. Enable queries conditionally
// ✅ Good: Only fetch when needed
const { data: datasets } = useProcessOutputDatasets(
process,
version,
{ enabled: !!process && version != null }
);
// ❌ Bad: Hook errors if process is null
const { data: datasets } = useProcessOutputDatasets(process, version);
Most hooks already handle this, but be aware when passing options.
4. Use empty arrays for default values
// ✅ Good: Prevents undefined errors
const { data: processes = [] } = useProcesses(projectId);
processes.map(p => ...); // Safe even during loading
// ❌ Bad: Errors during loading
const { data: processes } = useProcesses(projectId);
processes.map(p => ...); // Error: Cannot read property 'map' of undefined
5. Don't mix manual fetch with hooks
// ❌ Bad: Component won't update when cache invalidates
useEffect(() => {
fetch(`${API}/datasets?search=${search}`)
.then(r => r.json())
.then(setDatasets);
}, [search]);
// ✅ Good: Automatically updates
const { data: datasets = [] } = useSearchDatasets(search, true, projectId);
Common Pitfalls
Pitfall 1: Forgetting to invalidate after mutations
// ❌ Bad: UI won't update
const newProcess = await createProcess.mutateAsync({ proc, projectId });
setActiveProcess({ processId: newProcess.id, version: 1 });
// Where's the invalidation?
// ✅ Good: UI updates immediately
const newProcess = await createProcess.mutateAsync({ proc, projectId });
await invalidateProject(projectId);
setActiveProcess({ processId: newProcess.id, version: 1 });
Pitfall 2: Using wrong invalidation helper
// ❌ Overkill: Invalidates everything when only one process changed
await invalidateProject(projectId);
// ✅ Better: Only invalidate what changed
await invalidateProcess(processId, projectId);
However, when in doubt, use invalidateProject(). It's better to refetch too much than too little.
Pitfall 3: Not awaiting async invalidation
// ❌ Bad: Race condition
invalidateProject(projectId);
console.log('Processes:', processes); // Still old data!
// ✅ Good: Wait for refetch
await invalidateProject(projectId);
console.log('Processes:', processes); // New data available
Note: Even after awaiting, the processes variable won't update immediately in the current render. It will update in the next render cycle. But awaiting ensures the data is in the cache.
Pitfall 4: Invalidating in render
// ❌ Bad: Causes infinite loop
function MyComponent() {
const { invalidateProject } = useContext(ProcessContext);
invalidateProject(); // Called every render!
return <div>...</div>;
}
// ✅ Good: Invalidate in effect or handler
function MyComponent() {
const { invalidateProject } = useContext(ProcessContext);
useEffect(() => {
invalidateProject();
}, []); // Only once
return <div>...</div>;
}
Pitfall 5: Duplicate invalidation
// ❌ Bad: Mutation already invalidates, this is redundant
const createProject = useCreateProject(); // This auto-invalidates
await createProject.mutateAsync({ name });
await invalidateProjects(); // Unnecessary!
// Note: useCreateProcess does NOT auto-invalidate (by design)
// You must manually invalidate for processes
const createProcess = useCreateProcess(); // This does NOT auto-invalidate
await createProcess.mutateAsync({ proc, projectId });
await invalidateProject(projectId); // Required!
Debugging
Inspecting the Cache
Use React Query DevTools (if installed):
import { ReactQueryDevtools } from '@tanstack/react-query-devtools';
function App() {
return (
<>
{/* Your app */}
<ReactQueryDevtools initialIsOpen={false} />
</>
);
}
Logging Query State
const { data, isLoading, isFetching, isStale } = useProcesses(projectId);
console.log({
data,
isLoading, // True during first load
isFetching, // True during any fetch (including refetch)
isStale // True if data is stale (past staleTime)
});
Manual Cache Inspection
import { useQueryClient } from '@tanstack/react-query';
const queryClient = useQueryClient();
// Get current cache state
const processesQuery = queryClient.getQueryState(['processes', projectId]);
console.log('Status:', processesQuery?.status);
console.log('Data:', processesQuery?.data);
// Get cached data directly
const cachedProcesses = queryClient.getQueryData(['processes', projectId]);
console.log('Cached processes:', cachedProcesses);
// See all queries in cache
const allQueries = queryClient.getQueryCache().getAll();
console.log('All queries:', allQueries.map(q => q.queryKey));
Debugging Invalidation
Add temporary logging to invalidation helpers:
// In ProcessContext.js (temporary debugging)
invalidateProject: async (projectId = currentProject) => {
console.log('[invalidateProject] Starting for project:', projectId);
await Promise.all([
queryClient.refetchQueries({
queryKey: ['processes', projectId],
type: 'active'
}).then(() => console.log('[invalidateProject] Processes refetched')),
queryClient.refetchQueries({
queryKey: ['datasets'],
type: 'active'
}).then(() => console.log('[invalidateProject] Datasets refetched')),
// ... etc
]);
console.log('[invalidateProject] Complete');
}
Architecture Diagram
┌─────────────────────────────────────────────────────────────┐
│ Backend API │
└─────────────────────────────────────────────────────────────┘
▲
│ HTTP + WebSocket
▼
┌─────────────────────────────────────────────────────────────┐
│ TanStack Query Cache │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Processes │ │ Datasets │ │ Outputs │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────┘
▲
│
┌─────────────────┼─────────────────┐
│ │ │
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ FlowView │ │ PlotView │ │ Export │
│ │ │ │ │ │
│ uses │ │ uses │ │ uses │
│ hooks │ │ hooks │ │ hooks │
└──────────┘ └──────────┘ └──────────┘
│ │ │
└─────────────────┼─────────────────┘
│
▼
┌──────────────────┐
│ ProcessContext │
│ │
│ Invalidation │
│ Helpers │
│ │
│ - invalidate │
│ Process() │
│ - invalidate │
│ Project() │
│ - invalidate │
│ Datasets() │
└──────────────────┘
▲
│
WebSocket Updates
Mutations
User Actions
Related Documentation
- Widget System - How widgets consume query data
- JSON Schema Forms - Dataset selector integration
- Layout System - Layout state vs query state
- System Overview - Backend API structure
Migration Guide
If you have old code using manual fetch():
Before (Manual Fetch)
const [datasets, setDatasets] = useState([]);
const [loading, setLoading] = useState(false);
useEffect(() => {
setLoading(true);
fetch(`${API}/datasets?search=${search}&project_id=${projectId}`)
.then(r => r.json())
.then(data => {
setDatasets(data);
setLoading(false);
});
}, [search, projectId]);
After (TanStack Query Hook)
const { data: datasets = [], isLoading } = useSearchDatasets(
search,
true,
projectId
);
Benefits: - ✅ Automatically refetches when invalidated - ✅ Automatic caching and deduplication - ✅ No manual loading state management - ✅ Built-in error handling - ✅ Background refetching - ✅ Stale-while-revalidate behavior
Summary
The Three Rules:
- Use hooks for all data fetching - Never use manual
fetch()for server data - Invalidate through ProcessContext helpers only - Never call
queryClient.invalidateQueries()directly - Trust TanStack Query - Don't implement your own polling, caching, or coordination
Following these rules ensures deterministic, race-condition-free data propagation throughout the application.