Self-Discovering Documentation Systems

self-discovering documentation ai automation svelte5

Overview

Self-discovering documentation systems automatically identify, research, and integrate new topics without manual curation. They combine web search, AI content generation, and static site generation to create living documentation that evolves over time.

Core Concepts

1. Autonomous Research Loop

Define topics → Web search → Extract insights → Generate markdown → Build → Serve
     ↑                                                                          |
     └──────────────────── Schedule (cron) ─────────────────────────────────────┘

The system runs on a schedule (typically daily), researching predefined topics and updating content. New topics can be added by editing the research script’s topic list.
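
One cycle of the loop can be sketched as follows. This is a minimal illustration, not the actual research script: the function name `runResearchCycle` and the injected stage functions (`search`, `extract`, `render`, `build`) are hypothetical, and each stage is passed in so the loop can be exercised with stubs.

```javascript
// Hypothetical sketch of one research cycle. Every stage is injected,
// so nothing here assumes a particular search API or site generator.
async function runResearchCycle(topics, { search, extract, render, build }) {
  const pages = [];
  for (const topic of topics) {
    const results = await search(topic);        // web search
    const insights = extract(results);          // extract insights
    const markdown = render(topic, insights);   // generate markdown
    pages.push({ topic, markdown });
  }
  await build(pages);                           // build the site from the new pages
  return pages;
}
```

A scheduler (cron in the diagram above) would call this once per run with the current topic list, so adding a topic only means extending the array passed in.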

2. Topic Discovery

Topics can be discovered through:

  • Trending searches: What people are searching for in your domain
  • Gap analysis: Comparing existing content against a knowledge graph
  • User queries: Analyzing what users search for on the site
  • External signals: GitHub trending, HN top posts, arXiv papers
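
As an illustration of gap analysis, the sketch below compares a list of candidate topics against the titles of existing pages and returns the ones that are not covered yet. The names `slugify` and `findGaps` are hypothetical, and matching by normalized title is an assumption; a real knowledge graph would likely need fuzzier matching.

```javascript
// Normalize a title into a comparable slug: lowercase, non-alphanumeric runs become "-".
function slugify(title) {
  return title
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

// Return candidate topics with no matching existing page, skipping duplicate candidates.
function findGaps(candidateTopics, existingTitles) {
  const covered = new Set(existingTitles.map(slugify));
  const seen = new Set();
  return candidateTopics.filter((topic) => {
    const slug = slugify(topic);
    if (covered.has(slug) || seen.has(slug)) return false;
    seen.add(slug);
    return true;
  });
}
```

The candidates could come from any of the sources listed above (trending searches, site queries, external feeds); the function only decides which of them are new.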

3. Content Evolution

Each topic page tracks its own evolution:

  • Last updated: Date of most recent research
  • Next review: Scheduled review date
  • Evolution notes: What changed and why

This creates an audit trail and prevents content from going stale.
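
A minimal sketch of how these three fields might be updated after each research run. The field names (`lastUpdated`, `nextReview`, `evolutionNotes`) and the default 7-day review interval are assumptions chosen for illustration.

```javascript
// Add whole days to an ISO date (YYYY-MM-DD), computed in UTC to avoid timezone drift.
function addDays(isoDate, days) {
  const d = new Date(`${isoDate}T00:00:00Z`);
  d.setUTCDate(d.getUTCDate() + days);
  return d.toISOString().slice(0, 10);
}

// Return updated page metadata after a research run; the input object is not mutated.
function recordEvolution(meta, { today, note, reviewIntervalDays = 7 }) {
  return {
    ...meta,
    lastUpdated: today,
    nextReview: addDays(today, reviewIntervalDays),
    // Append rather than overwrite, so earlier notes remain as the audit trail.
    evolutionNotes: [...(meta.evolutionNotes || []), { date: today, note }],
  };
}
```

Appending to `evolutionNotes` instead of replacing it is what preserves the audit trail across runs.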

4. Quality Signals

Automated quality checks:

  • Source attribution: Every claim should have a source
  • Freshness: Content older than N days gets flagged
  • Completeness: Pages with empty “Key Findings” are stubs
  • Tag validity: Tags must be arrays, not comma-separated strings
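
The four checks could be combined into a single validation pass along these lines. This is a sketch under assumptions: the page shape (`tags`, `lastUpdated`, `keyFindings`, `sources`) and the 30-day default for N are hypothetical, and the source check only verifies that at least one source is listed, not that every claim is attributed.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Return a list of issues for one page; an empty list means all checks passed.
function checkPage(page, { today, maxAgeDays = 30 }) {
  const issues = [];

  // Tag validity: a comma-separated string such as "a, b" is rejected.
  if (!Array.isArray(page.tags)) {
    issues.push('tags must be an array');
  }

  // Freshness: date-only ISO strings are parsed as UTC by Date.parse.
  const ageDays = (Date.parse(today) - Date.parse(page.lastUpdated)) / MS_PER_DAY;
  if (Number.isNaN(ageDays)) {
    issues.push('lastUpdated is missing or not a valid date');
  } else if (ageDays > maxAgeDays) {
    issues.push(`stale: last updated ${Math.floor(ageDays)} days ago`);
  }

  // Completeness: an empty "Key Findings" section marks the page as a stub.
  if (!page.keyFindings || page.keyFindings.length === 0) {
    issues.push('stub: Key Findings is empty');
  }

  // Source attribution (coarse): require at least one listed source.
  if (!page.sources || page.sources.length === 0) {
    issues.push('no sources listed');
  }

  return issues;
}
```

Returning a list of issues rather than a boolean lets the scheduler flag a page for review without blocking the build.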

LLM-Wiki Implementation

The LLM-Wiki is a self-discovering docs site built from the following components:

Component        | Technology              | Role
Static generator | Hugo v0.140+            | Build HTML + JSON
Research engine  | Python + DuckDuckGo API | Gather topic research
Dashboard        | Vanilla JS SPA          | Render searchable cards
Scheduler        | Hermes cron             | Daily evolution cycle
Server           | Nginx + Let’s Encrypt   | Serve with SSL

Data Flow

  1. research-automation.py searches for each topic
  2. Results → Hugo markdown with front matter
  3. hugo --minify → public/ (HTML + JSON)
  4. Dashboard fetches /topics/index.json
  5. Vanilla JS renders cards with search + tag filters
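
Step 2 can be illustrated with a small function that turns research results into a Hugo content file. The original script is Python; this sketch uses JavaScript to match the component code in this page. The function name `toHugoPage`, the field names, and the shape of `findings` are hypothetical. It writes JSON front matter, one of the front matter formats Hugo accepts, because `JSON.stringify` handles quoting and keeps `tags` an array.

```javascript
// Build a Hugo content file: JSON front matter followed by a markdown body.
function toHugoPage({ title, date, tags, summary, findings }) {
  if (!Array.isArray(tags)) {
    throw new TypeError('tags must be an array, not a comma-separated string');
  }
  const frontMatter = { title, date, tags, summary };
  const body = [
    '## Key Findings',
    '',
    // One bullet per finding, each linked to its source for attribution.
    ...findings.map((f) => `- ${f.text} ([source](${f.url}))`),
  ];
  return `${JSON.stringify(frontMatter, null, 2)}\n\n${body.join('\n')}\n`;
}
```

Rejecting non-array tags at generation time catches the tag validity problem before the page ever reaches the build.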

Svelte 5 Enhancement Layer

Add interactive discovery features via Svelte 5 components:

<!-- components/TopicExplorer.svelte -->
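<script>
  import { onMount } from 'svelte';

  let topics = $state([]);
  let selectedTags = $state([]);
  let searchQuery = $state('');

  // Load topics from the JSON index that Hugo builds
  async function loadTopics() {
    try {
      const res = await fetch('/topics/index.json');
      if (!res.ok) throw new Error(`HTTP ${res.status}`);
      topics = await res.json();
    } catch (err) {
      console.error('Failed to load topics:', err);
    }
  }

  // Fetch once in the browser after the component mounts
  onMount(() => {
    loadTopics();
  });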
  
  // Derived: all unique tags
  let allTags = $derived([
    ...new Set(topics.flatMap(t => t.tags || []))
  ].sort());
  
  // Derived: filtered topics
  let filtered = $derived(topics.filter(t => {
    const matchesSearch = !searchQuery || 
      t.title.toLowerCase().includes(searchQuery.toLowerCase()) ||
      (t.summary || '').toLowerCase().includes(searchQuery.toLowerCase()) ||
      (t.tags || []).some(tag => tag.toLowerCase().includes(searchQuery.toLowerCase()));
    
    const matchesTags = selectedTags.length === 0 || 
      selectedTags.every(tag => (t.tags || []).includes(tag));
    
    return matchesSearch && matchesTags;
  }));
</script>

<div class="explorer">
  <input 
    bind:value={searchQuery} 
    placeholder="Search topics..." 
    class="search-input"
  />
  
  <div class="tag-filters">
    {#each allTags as tag}
      <button 
        class:active={selectedTags.includes(tag)}
        onclick={() => {
          if (selectedTags.includes(tag)) {
            selectedTags = selectedTags.filter(t => t !== tag);
          } else {
            selectedTags = [...selectedTags, tag];
          }
        }}
      >
        {tag}
      </button>
    {/each}
  </div>
  
  <div class="topic-grid">
    {#each filtered as topic}
      <article class="topic-card">
        <h3><a href={topic.url}>{topic.title}</a></h3>
        <p>{topic.summary?.replace(/<[^>]+>/g, '')?.slice(0, 160)}...</p>
        <div class="tags">
          {#each topic.tags || [] as tag}
            <span class="tag">{tag}</span>
          {/each}
        </div>
        <time>{topic.date}</time>
      </article>
    {/each}
  </div>
</div>

Key Svelte 5 patterns used:

  • $state for reactive UI state
  • $derived for computed filtered lists
  • {#each} for rendering lists
  • Event handlers via the onclick attribute, which replaces the on:click directive from Svelte 4
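
The filtering inside `$derived` is plain JavaScript, so it can be pulled out into a standalone function and tested without Svelte. The sketch below mirrors the component's logic; the function name `filterTopics` is hypothetical, and trimming the query is a small addition that is not in the component.

```javascript
// Return topics matching the search text and carrying every selected tag.
function filterTopics(topics, searchQuery, selectedTags) {
  const query = searchQuery.trim().toLowerCase();
  return topics.filter((t) => {
    const tags = t.tags || [];
    // An empty query matches everything; otherwise check title, summary, and tags.
    const matchesSearch =
      !query ||
      t.title.toLowerCase().includes(query) ||
      (t.summary || '').toLowerCase().includes(query) ||
      tags.some((tag) => tag.toLowerCase().includes(query));
    // every() is true for an empty selection, so no tags selected means no tag filter.
    const matchesTags = selectedTags.every((tag) => tags.includes(tag));
    return matchesSearch && matchesTags;
  });
}
```

Inside the component this would be used as `let filtered = $derived(filterTopics(topics, searchQuery, selectedTags));`.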

Anti-Patterns

  • Separate dashboard build: Don’t use Svelte/Vite for the dashboard; it adds a Node.js dependency to the deploy pipeline
  • Absolute URLs in JSON: .Permalink breaks when the domain differs; always use .RelPermalink
  • Taxonomy overuse: Don’t declare topic = "topics" in taxonomies; it breaks section JSON output
  • Global page lists: .Site.RegularPages in section templates shows unrelated content; use .Pages

Evolution Notes

Content last updated: 2026-06-05
Next review: 2026-06-12