Building a RAG System Without ML Embeddings

Overview

Retrieval-Augmented Generation (RAG) typically relies on ML embeddings to find semantically similar documents. But what if you need a RAG system that works offline, has zero dependencies, and gives you complete control over the search logic?

This project builds a lightweight RAG engine using keyword-based scoring that can search 160+ documents in milliseconds without any ML models.

Why Skip Embeddings?

ML Embeddings	Keyword-Based
Requires ML libraries (transformers, sentence-transformers)	Zero dependencies
Model loading takes seconds	Instant startup
Needs GPU for speed	Runs on any hardware
Black-box relevance	Transparent scoring
General-purpose	Domain-optimized

What We're Building

┌─────────────────────────────────────────────────────────────┐
│                    RAG Engine Architecture                  │
├─────────────────────────────────────────────────────────────┤
│  Document Loading   →  Keyword Extraction  →  Caching       │
│         ↓                                                   │
│  Query Expansion    →  Relevance Scoring   →  Response      │
└─────────────────────────────────────────────────────────────┘

Project Setup

Create a new Node.js project:

mkdir rag-engine && cd rag-engine
npm init -y

No dependencies needed! We use only Node.js built-ins.

File Structure

rag-engine/
├── rag-engine.js      # Core search logic
├── cli.js             # Interactive CLI
├── test-rag.js        # Test suite
└── docs/              # Your markdown documentation
    ├── CONTAINERS/
    ├── WINDOWS/
    ├── SECURITY/
    └── ...

Core Components

1. Document Loading with Caching

Load markdown files with a 5-minute cache:

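const fs = require('fs').promises;
const path = require('path');

// Root folder for the markdown docs. Assumed to be ./docs next to this file,
// matching the file structure above; adjust if your docs live elsewhere.
const DOCS_BASE = path.join(__dirname, 'docs');

const CACHE_TTL = 5 * 60 * 1000; // 5 minutes
let documentCache = null;
let cacheTimestamp = 0;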
 
// Document categories with priority (lower = higher priority)
const HOWTO_FOLDERS = [
  { name: 'CONTAINERS', category: 'docker', priority: 1 },
  { name: 'WINDOWS', category: 'windows', priority: 2 },
  { name: 'SECURITY', category: 'security', priority: 2 },
  { name: 'NETWORKING', category: 'networking', priority: 2 },
  { name: 'LINUX', category: 'linux', priority: 3 },
  { name: 'DEPLOYMENT', category: 'deployment', priority: 2 },
];
 
async function loadDocuments() {
  // Return cached if valid
  if (documentCache && Date.now() - cacheTimestamp < CACHE_TTL) {
    return documentCache;
  }
 
  const documents = [];
 
  for (const folder of HOWTO_FOLDERS) {
    const folderPath = path.join(DOCS_BASE, folder.name);
 
    try {
      const files = await fs.readdir(folderPath);
 
      for (const file of files) {
        if (!file.endsWith('.md')) continue;
 
        const filePath = path.join(folderPath, file);
        const content = await fs.readFile(filePath, 'utf8');
 
        documents.push({
          title: extractTitle(content, file),
          folder: folder.name,
          category: folder.category,
          priority: folder.priority,
          path: filePath,
          content,
          keywords: extractKeywords(content, folder.name),
          sections: extractSections(content),
        });
      }
    } catch (err) {
      // Folder doesn't exist - skip; rethrow anything else (e.g. permission errors)
      if (err.code !== 'ENOENT') throw err;
    }
  }
 
  documentCache = documents;
  cacheTimestamp = Date.now();
  return documents;
}
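
The loader calls extractTitle and extractSections, which are not shown in this article. Below is a minimal sketch of what they might look like. It assumes the title comes from the first level-1 heading (falling back to the file name) and that sections are split at level-2 and level-3 headings. Both behaviours are assumptions, not the original implementation.

```javascript
// Hypothetical helpers: the loader references these names, but their real
// implementations are not shown, so the behaviour below is assumed.

// Use the first level-1 heading as the title, else the file name without ".md".
function extractTitle(content, fileName) {
  const match = content.match(/^#\s+(.+)$/m);
  return match ? match[1].trim() : fileName.replace(/\.md$/i, '');
}

// Split the document into { title, content } sections at ## and ### headings.
function extractSections(content) {
  const sections = [];
  let current = null;

  for (const line of content.split('\n')) {
    const heading = line.match(/^#{2,3}\s+(.+)$/);
    if (heading) {
      if (current) sections.push(current);
      current = { title: heading[1].trim(), content: '' };
    } else if (current) {
      current.content += line + '\n';
    }
  }
  if (current) sections.push(current);

  // Trim surrounding whitespace from each section body
  return sections.map(s => ({ title: s.title, content: s.content.trim() }));
}

const sample = '# Backup Guide\n\nIntro\n\n## Steps\nRun the job.\n\n## Restore\nCopy files back.';
console.log(extractTitle(sample, 'backup.md'));          // Backup Guide
console.log(extractSections(sample).map(s => s.title));  // [ 'Steps', 'Restore' ]
```

Text before the first ## heading (the intro) is not captured as a section in this sketch. getResponse falls back to the full document content when no section matches.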

2. Keyword Extraction

Extract domain-specific keywords from documents:

function extractKeywords(content, folderName) {
  const keywords = new Set();
 
  // Add folder name as keyword
  keywords.add(folderName.toLowerCase());
 
  // Domain-specific terms to look for
  const terms = [
    // Containers
    'container', 'docker', 'image', 'dockerfile',
    // Security
    'security', 'password', 'encryption', 'firewall',
    // Networking
    'network', 'dns', 'vpn', 'subnet', 'vlan',
    // Windows
    'windows', 'powershell', 'registry', 'group policy',
    // Linux
    'linux', 'bash', 'systemd', 'apt', 'yum',
    // General
    'backup', 'restore', 'error', 'troubleshoot',
  ];
 
  const lowerContent = content.toLowerCase();
 
  for (const term of terms) {
    if (lowerContent.includes(term)) {
      keywords.add(term);
    }
  }
 
  // Extract headers as keywords
  const headers = content.match(/^#{1,3}\s+(.+)$/gm) || [];
  for (const header of headers) {
    const text = header.replace(/^#+\s+/, '').toLowerCase();
    text.split(/\s+/).forEach(word => {
      if (word.length > 3) keywords.add(word);
    });
  }
 
  return Array.from(keywords);
}
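
One thing to watch in the header step: splitting on whitespace keeps punctuation attached, so a heading word like "docker:" or "(volumes)" is stored with its punctuation and will not equal a plain query term. A possible refinement, not part of the original code, is to strip leading and trailing punctuation first. The function name headerKeywords is hypothetical, and the sketch assumes ASCII headings.

```javascript
// Hypothetical refinement of the header step in extractKeywords:
// strip punctuation before applying the same "longer than 3 chars" filter.
function headerKeywords(content) {
  const keywords = new Set();
  const headers = content.match(/^#{1,3}\s+(.+)$/gm) || [];

  for (const header of headers) {
    const text = header.replace(/^#+\s+/, '').toLowerCase();
    for (const raw of text.split(/\s+/)) {
      // Drop leading/trailing characters that are not ASCII letters or digits
      const word = raw.replace(/^[^a-z0-9]+|[^a-z0-9]+$/g, '');
      if (word.length > 3) keywords.add(word);
    }
  }
  return Array.from(keywords);
}

console.log(headerKeywords('# Docker: Backup (Volumes)\n## Fix "port" errors'));
// [ 'docker', 'backup', 'volumes', 'port', 'errors' ]
```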

3. Query Expansion

The secret sauce: map each query word to related terms, so a search for one word also matches documents that use its neighbours:

const KEYWORD_MAP = {
  // Container ecosystem
  'container': ['docker', 'image', 'dockerfile', 'compose'],
  'docker': ['container', 'image', 'dockerfile', 'compose'],
 
  // Security domain
  'security': ['firewall', 'encryption', 'password', 'edr'],
  'firewall': ['fortigate', 'policy', 'rule', 'vpn'],
  'antivirus': ['edr', 'threat', 'malware', 'sentinelone'],
 
  // Networking
  'network': ['dns', 'ip', 'subnet', 'vlan', 'vpn'],
  'vpn': ['ssl', 'ipsec', 'remote', 'tunnel'],
 
  // Windows
  'windows': ['powershell', 'registry', 'gpo', 'driver'],
  'driver': ['update', 'bios', 'firmware'],
 
  // Error handling
  'error': ['troubleshoot', 'fix', 'problem', 'failed'],
  'install': ['deploy', 'setup', 'configure'],
};
 
function expandQuery(query) {
  const queryLower = query.toLowerCase();
  const expanded = new Set(
    queryLower.split(/\s+/).filter(w => w.length > 2)
  );
 
  // Add related terms from keyword map
  for (const [key, related] of Object.entries(KEYWORD_MAP)) {
    if (queryLower.includes(key)) {
      related.forEach(term => expanded.add(term));
    }
  }
 
  return Array.from(expanded);
}

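Example expansion (using the KEYWORD_MAP above):

Query: "docker container backup"
Expanded: ["docker", "container", "backup", "image", "dockerfile", "compose"]

Terms such as "restore" or "recovery" are only added if KEYWORD_MAP also contains a 'backup' entry, which the excerpt above does not include.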
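
Note that expandQuery matches map keys with includes, so a key also fires inside a longer word (for example "error" inside "terror"). If that matters for your corpus, one option is to match whole tokens instead. Below is a minimal sketch using a two-entry stand-in for KEYWORD_MAP and a hypothetical function name, expandQueryByToken. The trade-off is that this variant no longer matches plurals such as "containers".

```javascript
// Small stand-in for the article's KEYWORD_MAP (subset, for illustration only)
const KEYWORD_MAP = {
  container: ['docker', 'image', 'dockerfile', 'compose'],
  error: ['troubleshoot', 'fix', 'problem', 'failed'],
};

// Hypothetical variant of expandQuery: a map key only fires when it
// appears as a whole word in the query, not as a substring.
function expandQueryByToken(query) {
  const tokens = query.toLowerCase().split(/\s+/).filter(w => w.length > 2);
  const expanded = new Set(tokens);

  for (const token of tokens) {
    // hasOwnProperty guards against inherited keys such as "constructor"
    if (Object.prototype.hasOwnProperty.call(KEYWORD_MAP, token)) {
      KEYWORD_MAP[token].forEach(term => expanded.add(term));
    }
  }
  return Array.from(expanded);
}

console.log(expandQueryByToken('container error'));
console.log(expandQueryByToken('terror alert')); // [ 'terror', 'alert' ]
```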

The Scoring Algorithm

Each document gets a score built from keyword matches, title and content hits, contextual boosts, and a folder-priority penalty:

async function search(query, options = {}) {
  const { maxResults = 5, category = null } = options;
  const documents = await loadDocuments();
  const queryTerms = expandQuery(query);
  const queryLower = query.toLowerCase();
 
  const scored = documents
    .filter(doc => !category || doc.category === category)
    .map(doc => {
      let score = 0;
 
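      // Doc keywords: +3 if among the expanded terms, +2 if present in the raw query
      for (const keyword of doc.keywords) {
        if (queryTerms.includes(keyword)) score += 3;
        if (queryLower.includes(keyword)) score += 2;
      }

      // Query terms: +1 if found in the body, +3 if found in the title
      // (lowercase once per document rather than once per term)
      const contentLower = doc.content.toLowerCase();
      const titleLower = doc.title.toLowerCase();
      for (const term of queryTerms) {
        if (contentLower.includes(term)) score += 1;
        if (titleLower.includes(term)) score += 3;
      }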
 
      // Contextual boosts
      if (queryLower.match(/error|problem|fix|failed|issue/)) {
        if (doc.title.toLowerCase().includes('troubleshoot')) {
          score += 5;
        }
      }
 
      if (queryLower.match(/deploy|install|setup|configure/)) {
        if (doc.title.toLowerCase().match(/deployment|install|setup/)) {
          score += 5;
        }
      }
 
      // Priority penalty (lower priority = higher number)
      score -= doc.priority * 0.5;
 
      return { doc, score };
    });
 
  return scored
    .filter(s => s.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxResults)
    .map(s => s.doc);
}

Scoring Breakdown

Factor	Points	Logic
Expanded keyword match	+3	Query term in doc keywords
Direct keyword in query	+2	Doc keyword found in query
Term in content	+1	Any match in body text
Term in title	+3	Match in document title
Troubleshoot boost	+5	Error query + troubleshoot doc
Deployment boost	+5	Install query + setup doc
Priority penalty	-0.5/level	Lower priority folders ranked lower
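
To make the arithmetic concrete, the sketch below applies the same rules to one hand-made document and returns a per-factor breakdown. The document and the scoreBreakdown name are invented for illustration. The query terms are what expandQuery would produce for "docker error" with the KEYWORD_MAP shown earlier.

```javascript
// Hypothetical helper mirroring the scoring rules in search(), but
// returning each factor separately so the total is easy to verify.
function scoreBreakdown(doc, queryTerms, queryLower) {
  const title = doc.title.toLowerCase();
  const content = doc.content.toLowerCase();
  const parts = { expanded: 0, direct: 0, content: 0, title: 0, boost: 0, penalty: 0 };

  for (const keyword of doc.keywords) {
    if (queryTerms.includes(keyword)) parts.expanded += 3;
    if (queryLower.includes(keyword)) parts.direct += 2;
  }
  for (const term of queryTerms) {
    if (content.includes(term)) parts.content += 1;
    if (title.includes(term)) parts.title += 3;
  }
  if (/error|problem|fix|failed|issue/.test(queryLower) && title.includes('troubleshoot')) {
    parts.boost += 5;
  }
  if (/deploy|install|setup|configure/.test(queryLower) && /deployment|install|setup/.test(title)) {
    parts.boost += 5;
  }
  parts.penalty = -doc.priority * 0.5;

  const total = Object.values(parts).reduce((sum, n) => sum + n, 0);
  return { ...parts, total };
}

// Made-up document, for illustration only
const doc = {
  title: 'Troubleshoot Docker Startup',
  content: 'If a container failed to start, check the docker logs.',
  keywords: ['containers', 'docker', 'container', 'troubleshoot'],
  priority: 1,
};
const queryLower = 'docker error';
const queryTerms = [
  'docker', 'error', 'container', 'image', 'dockerfile', 'compose',
  'troubleshoot', 'fix', 'problem', 'failed',
];

console.log(scoreBreakdown(doc, queryTerms, queryLower));
// { expanded: 9, direct: 2, content: 3, title: 6, boost: 5, penalty: -0.5, total: 24.5 }
```

Three of the four doc keywords appear in the expanded terms (9 points), only "docker" appears in the raw query (2), three terms occur in the body (3), two in the title (6), and the error query triggers the troubleshoot boost (5). After the 0.5 priority penalty the total is 24.5.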

Response Generation

Build a formatted response with source attribution:

async function getResponse(query) {
  const docs = await search(query, { maxResults: 3 });
 
  if (docs.length === 0) {
    return {
      content: `No results found for: "${query}"`,
      sources: [],
      matched: false,
    };
  }
 
  const topDoc = docs[0];
 
  // Find best matching section (never reassigned, so const)
  const bestSection = findBestSection(topDoc, query);
 
  // Build response
  let response = `## ${topDoc.title}\n`;
  response += `*Source: ${topDoc.folder}*\n\n`;
 
  if (bestSection) {
    response += `### ${bestSection.title}\n\n`;
    response += bestSection.content.slice(0, 3000);
  } else {
    response += topDoc.content.slice(0, 3000);
  }
 
  // Add related documents
  if (docs.length > 1) {
    response += '\n\n---\n**Related:**\n';
    docs.slice(1).forEach(d => {
      response += `- ${d.title}\n`;
    });
  }
 
  return {
    content: response,
    sources: docs.map(d => ({
      title: d.title,
      folder: d.folder,
    })),
    matched: true,
  };
}
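
getResponse relies on findBestSection, which is not shown. One plausible shape, offered as an assumption rather than the original code, is to score each section by how many query words appear in its heading and body. The weights and the { title, content } section shape are illustrative.

```javascript
// Hypothetical helper: getResponse calls findBestSection, but its real
// implementation is not shown. Assumes each section is { title, content }.
function findBestSection(doc, query) {
  const terms = query.toLowerCase().split(/\s+/).filter(w => w.length > 2);
  let best = null;
  let bestScore = 0;

  for (const section of doc.sections || []) {
    const title = section.title.toLowerCase();
    const content = section.content.toLowerCase();
    let score = 0;

    for (const term of terms) {
      if (title.includes(term)) score += 3;   // heading hits weigh more
      if (content.includes(term)) score += 1; // body hits weigh less
    }
    if (score > bestScore) {
      best = section;
      bestScore = score;
    }
  }
  return best; // null when no section mentions any query term
}

const doc = {
  sections: [
    { title: 'Overview', content: 'General notes about the service.' },
    { title: 'Restore from Backup', content: 'Copy the backup archive and restart.' },
  ],
};
console.log(findBestSection(doc, 'restore backup').title); // Restore from Backup
console.log(findBestSection(doc, 'printer'));              // null
```

Returning null when nothing matches lets getResponse fall through to its else branch and show the start of the whole document.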

CLI Interface

Create an interactive command-line interface:

#!/usr/bin/env node
const readline = require('readline');
const { getResponse, getStats } = require('./rag-engine');
 
const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
});
 
async function main() {
  console.log('\n========================================');
  console.log('   OFFLINE RAG - Documentation Search');
  console.log('========================================\n');
 
  const stats = await getStats();
  console.log(`Loaded ${stats.total} documents\n`);
  console.log('Commands: /stats, /quit\n');
 
  askQuestion();
}
 
function askQuestion() {
  rl.question('> ', async (input) => {
    const query = input.trim();
 
    if (!query) {
      askQuestion();
      return;
    }
 
    if (query === '/quit') {
      console.log('Goodbye!');
      rl.close();
      return;
    }
 
    if (query === '/stats') {
      const stats = await getStats();
      console.log(JSON.stringify(stats, null, 2));
      askQuestion();
      return;
    }
 
    // Search and respond
    console.log('\nSearching...\n');
    const response = await getResponse(query);
 
    console.log('─'.repeat(50));
    console.log(response.content);
    console.log('─'.repeat(50));
 
    if (response.sources.length > 0) {
      console.log('\nSources:');
      response.sources.forEach(s => {
        console.log(`  - ${s.title} [${s.folder}]`);
      });
    }
 
    console.log();
    askQuestion();
  });
}
 
main();
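
The CLI imports getStats from rag-engine.js, but the function is not shown. Below is a minimal sketch of the counting logic, assuming the stats object carries a total (the only field the CLI actually reads) plus a per-category count. The byCategory field and the computeStats name are assumptions.

```javascript
// Hypothetical sketch: only the `total` field is confirmed by the CLI code;
// the byCategory breakdown is an assumption.
function computeStats(documents) {
  const byCategory = {};
  for (const doc of documents) {
    byCategory[doc.category] = (byCategory[doc.category] || 0) + 1;
  }
  return { total: documents.length, byCategory };
}

// Inside rag-engine.js this could be wrapped and exported as:
//   async function getStats() { return computeStats(await loadDocuments()); }
//   module.exports = { search, getResponse, getStats };

const docs = [
  { title: 'A', category: 'docker' },
  { title: 'B', category: 'docker' },
  { title: 'C', category: 'security' },
];
console.log(computeStats(docs));
// { total: 3, byCategory: { docker: 2, security: 1 } }
```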

Usage Examples

Programmatic Usage

const { search, getResponse, getStats } = require('./rag-engine');

// CommonJS modules do not support top-level await, so wrap the calls
async function main() {
  // Search for documents
  const results = await search('docker container backup');
  console.log('Found:', results.map(r => r.title));

  // Get formatted response
  const response = await getResponse('how to update BIOS');
  console.log(response.content);

  // Get statistics
  const stats = await getStats();
  console.log(`${stats.total} documents loaded`);
}

main().catch(console.error);

CLI Session

> docker container not starting

──────────────────────────────────────────────────
## HOWTO- Troubleshoot BC Container Issues

*Source: CONTAINERS*

### Common Startup Problems

1. **Port conflicts** - Check if ports 80/443 are in use
2. **Memory limits** - Containers need at least 4GB RAM
3. **License issues** - Verify license file is mounted

...

──────────────────────────────────────────────────

Sources:
  - HOWTO- Troubleshoot BC Container Issues [CONTAINERS]
  - HOWTO- Docker Container Backup [CONTAINERS]

Performance

With 161 documents (1.2MB total):

Metric	Value
Cold start	~50ms
Cached query	~5ms
Memory usage	~15MB
Cache TTL	5 minutes
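
Numbers like these depend on hardware and corpus, so it is worth measuring on your own machine. A small timing harness could look like the sketch below. It uses Node's built-in perf_hooks module. The timeAsync name is hypothetical and fakeSearch is a stand-in for the real search function.

```javascript
const { performance } = require('perf_hooks');

// Time one async call; return the elapsed milliseconds along with its result.
async function timeAsync(fn) {
  const start = performance.now();
  const result = await fn();
  return { ms: performance.now() - start, result };
}

// Stand-in for search(); swap in the real function from rag-engine.js.
async function fakeSearch(query) {
  return [`result for ${query}`];
}

async function main() {
  // With the real engine, the first call loads files and the second hits the cache
  const cold = await timeAsync(() => fakeSearch('docker backup'));
  const warm = await timeAsync(() => fakeSearch('docker backup'));
  console.log(`first call: ${cold.ms.toFixed(2)}ms, second call: ${warm.ms.toFixed(2)}ms`);
}

main().catch(console.error);
```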

When to Use This Approach

Good fit:

Air-gapped environments
Domain-specific documentation
Full control over search logic needed
Simple deployment (single file)
No ML infrastructure available

Consider embeddings instead:

General-purpose search
Semantic similarity needed
Large document corpus (10K+)
Multi-language support

Extending the System

Add Web Interface

Wrap the engine in an Express API. Express is the one external dependency in this article (npm install express):

const express = require('express');
const { search, getResponse } = require('./rag-engine');
 
const app = express();
app.use(express.json());
 
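app.post('/api/search', async (req, res) => {
  const { query, maxResults = 5 } = req.body;
  // Reject missing or non-string queries before they reach search()
  if (typeof query !== 'string' || !query.trim()) {
    return res.status(400).json({ error: 'query must be a non-empty string' });
  }
  const results = await search(query, { maxResults });
  res.json({ results });
});

app.post('/api/chat', async (req, res) => {
  const { query } = req.body;
  if (typeof query !== 'string' || !query.trim()) {
    return res.status(400).json({ error: 'query must be a non-empty string' });
  }
  const response = await getResponse(query);
  res.json(response);
});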
 
app.listen(3000);
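
If keeping the zero-dependency property matters, a similar endpoint can be served with Node's built-in http module instead of Express. Below is a minimal sketch, assuming a hypothetical createSearchServer factory that receives the search function as an argument. The stubSearch function stands in for the real engine.

```javascript
const http = require('http');

// Collect the request body and parse it as JSON
function readJson(req) {
  return new Promise((resolve, reject) => {
    let body = '';
    req.on('data', chunk => { body += chunk; });
    req.on('end', () => {
      try { resolve(JSON.parse(body || '{}')); } catch (err) { reject(err); }
    });
    req.on('error', reject);
  });
}

function sendJson(res, status, payload) {
  res.writeHead(status, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify(payload));
}

// Hypothetical factory: builds (but does not start) a server around searchFn
function createSearchServer(searchFn) {
  return http.createServer(async (req, res) => {
    if (req.method !== 'POST' || req.url !== '/api/search') {
      return sendJson(res, 404, { error: 'not found' });
    }
    try {
      const { query, maxResults } = await readJson(req);
      if (typeof query !== 'string' || !query.trim()) {
        return sendJson(res, 400, { error: 'query must be a non-empty string' });
      }
      const results = await searchFn(query, { maxResults });
      sendJson(res, 200, { results });
    } catch (err) {
      // Malformed JSON or a failure inside searchFn
      sendJson(res, 400, { error: err.message });
    }
  });
}

// Stand-in for require('./rag-engine').search
async function stubSearch(query) {
  return [{ title: `Stub result for "${query}"` }];
}

const server = createSearchServer(stubSearch);
// To start it: server.listen(3000);
```

Passing the search function in as a parameter keeps the HTTP layer separate from the engine, which also makes it easy to exercise with a stub.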

Add Category Filtering

// Search only in security docs
const securityDocs = await search('firewall rules', {
  category: 'security',
  maxResults: 10
});

Export Chat History

function exportToMarkdown(history) {
  let md = '# Chat Export\n\n';
 
  for (const entry of history) {
    md += `## Q: ${entry.query}\n\n`;
    md += `${entry.response}\n\n`;
    md += `---\n\n`;
  }
 
  return md;
}
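
To persist the export, the markdown string can be written to disk with the built-in fs module. The sketch below repeats exportToMarkdown so it runs on its own. The saveHistory helper, the output path, and the { query, response } entry shape (with response holding the content string from getResponse) are assumptions.

```javascript
const fs = require('fs').promises;
const os = require('os');
const path = require('path');

// Same logic as exportToMarkdown above, repeated so this sketch runs standalone
function exportToMarkdown(history) {
  let md = '# Chat Export\n\n';
  for (const entry of history) {
    md += `## Q: ${entry.query}\n\n${entry.response}\n\n---\n\n`;
  }
  return md;
}

// Hypothetical helper: write the exported history to a markdown file
async function saveHistory(history, filePath) {
  await fs.writeFile(filePath, exportToMarkdown(history), 'utf8');
  return filePath;
}

// Assumed entry shape: { query, response }
const history = [
  { query: 'docker container not starting', response: '## Example answer' },
];

saveHistory(history, path.join(os.tmpdir(), 'chat-export.md'))
  .then(file => console.log(`Saved to ${file}`))
  .catch(console.error);
```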

Next Steps