Getting Started with CaveauAI — Complete Guide


Welcome to CaveauAI

CaveauAI is Blue Note Logic's enterprise AI knowledge platform. It transforms your documents into a private, queryable AI knowledge base that delivers answers with full source citations. This guide walks you through everything you need to get started.


Step 1: Log In to the AI Platform

After subscribing through the billing portal, navigate to ai.bluenotelogic.com and log in with the same email and password you use for billing. Your accounts are linked via Single Sign-On.

Once logged in, you'll land on the Dashboard where you can see:

  • Query usage — How many AI queries you've used this month vs. your plan limit
  • Documents — Total documents in your private corpus
  • Knowledge chunks — The number of text segments indexed for search
  • Storage — Disk usage for uploaded files
  • Subscribed Packages — Shared knowledge bases you have access to (e.g., Norwegian Law, EU AI Act)

Step 2: Upload Your Documents

Navigate to Corpus in the left sidebar. This is where you manage your private knowledge base.

Uploading Files

  1. Click the upload area or drag and drop files directly
  2. Supported formats: PDF, DOCX, TXT, HTML, Markdown
  3. Maximum file size: 50 MB per file
  4. You can upload multiple files in a single batch
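
The limits above can be checked locally before uploading. The following is a minimal sketch, not part of CaveauAI itself: the function name, the constant names, and the mapping of formats to file extensions (for example ".md" for Markdown) are assumptions made for illustration, as is reading 50 MB as 50 * 1024 * 1024 bytes.

```python
from pathlib import Path

# Limits taken from the list above; all names here are hypothetical.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".html", ".md"}
MAX_FILE_BYTES = 50 * 1024 * 1024  # assumes 50 MB means 50 * 1024 * 1024 bytes


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 50 MB")
    return problems
```

Running check_upload over every file in a batch before dragging it into the upload area gives a quick list of the files likely to be rejected.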

Adding Metadata

For each uploaded file, you can set:

  • Title — Descriptive name (auto-filled from filename)
  • Category — Organize documents by topic (contracts, policies, regulations, etc.)
  • Author — Document author for attribution
  • Tags — Comma-separated tags for filtering (e.g., "gdpr, compliance, 2024")
  • Source URL — Original web source if applicable
  • Publication Date — When the document was published
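
For teams that prepare metadata in a script before uploading, the fields above could be modeled as follows. This is an illustrative sketch only: the class name, the field names, and the tag-parsing rule (trim whitespace, drop empty entries) are assumptions and do not describe a CaveauAI API.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


def parse_tags(raw: str) -> list[str]:
    """Split a comma-separated tag string, trimming whitespace and dropping empties."""
    return [tag.strip() for tag in raw.split(",") if tag.strip()]


@dataclass
class DocumentMetadata:
    """Hypothetical record mirroring the metadata fields listed above."""
    title: str
    category: Optional[str] = None
    author: Optional[str] = None
    tags: list[str] = field(default_factory=list)
    source_url: Optional[str] = None
    publication_date: Optional[date] = None


meta = DocumentMetadata(
    title="Data Processing Agreement",
    category="contracts",
    tags=parse_tags("gdpr, compliance, 2024"),
    publication_date=date(2024, 1, 15),
)
```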

What Happens After Upload

  1. The platform extracts text from your file (PDF text extraction, DOCX parsing, etc.)
  2. The text is split into semantically meaningful chunks (~600 words each, heading-aware)
  3. Each chunk is embedded into a 768-dimensional vector using our nomic-embed-text model
  4. Vectors are stored in our Qdrant vector database for fast semantic search
  5. Status changes from pending to ready (or to error if extraction fails)

This entire process typically takes 10-30 seconds per document.
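
To make the chunking step concrete, here is a minimal sketch of heading-aware splitting. It is not the platform's actual chunker: it assumes headings are marked Markdown-style with a leading "#", counts words by whitespace, and omits the embedding and storage steps. The function name and the dictionary keys are hypothetical.

```python
def chunk_text(text: str, max_words: int = 600) -> list[dict]:
    """Split text into chunks of at most max_words words, never crossing a heading."""
    chunks = []
    heading = ""
    words: list[str] = []

    def flush() -> None:
        # Emit the words collected so far under the current heading.
        if words:
            chunks.append({"heading": heading, "text": " ".join(words)})
            words.clear()

    for line in text.splitlines():
        if line.startswith("#"):  # assumed heading marker (Markdown style)
            flush()
            heading = line.lstrip("#").strip()
            continue
        for word in line.split():
            words.append(word)
            if len(words) >= max_words:
                flush()
    flush()
    return chunks
```

Keeping the heading with each chunk means a retrieved passage can be shown with the section it came from, which is one plausible reason to make chunking heading-aware.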


Step 3: Ask Questions with AI Chat

Navigate to Chat in the sidebar. This is where you interact with your knowledge base.

How RAG Chat Works

When you ask a question:

  1. Your question is embedded into a vector
  2. CaveauAI searches your private documents AND any subscribed shared corpuses
  3. The most relevant text passages are retrieved (with relevance scoring)
  4. These passages are sent to the AI model as context
  5. The model generates an answer citing specific sources
  6. You see both the answer and the source documents it used
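
The retrieval steps above can be sketched in a few lines. This is a toy illustration under stated assumptions, not CaveauAI's implementation: a bag-of-words count vector stands in for the real embedding model, an in-memory list stands in for the vector database, and the function names, dictionary keys, and prompt wording are all hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, passages: list[dict], top_k: int = 2) -> list[dict]:
    """Score every passage against the question and keep the top_k most similar."""
    q = embed(question)
    scored = [{**p, "score": cosine(q, embed(p["text"]))} for p in passages]
    return sorted(scored, key=lambda p: p["score"], reverse=True)[:top_k]


def build_prompt(question: str, hits: list[dict]) -> str:
    """Assemble the retrieved passages into numbered context for the model."""
    context = "\n".join(f"[{i}] ({h['source']}) {h['text']}" for i, h in enumerate(hits, 1))
    return f"Answer using only the numbered sources.\n\n{context}\n\nQuestion: {question}"
```

Numbering the passages in the prompt is what lets a model refer back to a specific source, which corresponds to the cited sources shown alongside each answer.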

Chat Features

  • Corpus Selector — Choose which knowledge bases to search (your private docs, subscribed packages, or all)
  • Model Selector — Choose between available models (plan-dependent)
  • Parameter Controls — Adjust Temperature (creativity), Top P (diversity), and Max Tokens (response length)
  • Source Panel — Click any source badge to see the full retrieved passage with relevance score
  • Feedback — Rate answers with thumbs up/down to help improve quality
  • Streaming — Responses arrive in real-time, token by token

Tips for Better Results

  • Be specific — "What are the GDPR requirements for data processing agreements?" beats "Tell me about GDPR"
  • Use Norwegian — The bnl-legal model excels with Norwegian legal questions
  • Filter by corpus — If you know which knowledge base has the answer, select only that one
  • Check sources — Always verify important claims by clicking through to source documents

Step 4: Subscribe to Shared Knowledge Packages

Under Corpus, you'll find the Knowledge Packages section showing available shared corpuses:

Package            | Contents                                                              | Documents
Norwegian Law      | Lovdata regulations, KOFA decisions, court rulings, family law        | 2,000+
EU AI Act          | AI Act text, DPA decisions, EDPB guidance, national implementations   | 1,500+
CO2 & Climate      | EU ETS, Paris Agreement, Nordic climate laws, ESG frameworks          | 1,000+
Telecom Regulation | BEREC guidelines, ETSI standards, Nkom decisions, spectrum policy     | 800+
Employment Law     | Arbeidsmiljøloven, collective agreements, Arbeidstilsynet guidance    | 1,200+

Subscribed packages are automatically included in your chat searches. Your private documents are boosted 1.5x in relevance over shared content to prioritize your own knowledge.
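
As a worked example of the 1.5x boost, the sketch below re-ranks a mixed set of results. It assumes the boost is a simple multiplier on the raw relevance score, which the guide does not specify; the function name, dictionary keys, and scores are hypothetical.

```python
PRIVATE_BOOST = 1.5  # factor stated in this guide


def rank(hits: list[dict]) -> list[dict]:
    """Sort hits by score, multiplying private-document scores by PRIVATE_BOOST."""
    def boosted(hit: dict) -> float:
        # Assumption: the boost is multiplicative on the raw relevance score.
        return hit["score"] * (PRIVATE_BOOST if hit["private"] else 1.0)

    return sorted(hits, key=boosted, reverse=True)


hits = [
    {"source": "Norwegian Law", "score": 0.80, "private": False},
    {"source": "contract.pdf", "score": 0.60, "private": True},
]
ranked = rank(hits)
```

Under this assumption the private document ranks first, because 0.60 * 1.5 = 0.90 exceeds the shared passage's 0.80. A private passage scoring 0.50 would reach only 0.75 and stay below it.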


Step 5: Set Up Web Scrapers (Professional+)

Navigate to Pipeline to configure automated web scraping that keeps your knowledge base current.

  1. Click Create Source
  2. Enter the URL to monitor
  3. Choose a schedule: Manual, Daily, or Weekly
  4. Set depth (how many levels of links to follow from the starting URL) and page limit
  5. Submit for approval — our team reviews sources within 24 hours

Once approved, the scraper runs on schedule and automatically ingests new content into your corpus.
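
To see how depth and page limit interact, here is a small sketch of a breadth-first crawl over an in-memory link map. It makes no network requests and does not represent the platform's scraper: the function name, the parameter names, and the example paths are hypothetical, and it assumes depth counts link hops from the starting URL and that the page limit is at least 1.

```python
from collections import deque


def crawl(start: str, links: dict[str, list[str]], max_depth: int, page_limit: int) -> list[str]:
    """Breadth-first walk from start, following links up to max_depth hops and page_limit pages."""
    visited = [start]
    seen = {start}
    queue = deque([(start, 0)])
    while queue and len(visited) < page_limit:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the configured depth
        for target in links.get(url, []):
            if target not in seen and len(visited) < page_limit:
                seen.add(target)
                visited.append(target)
                queue.append((target, depth + 1))
    return visited


# Hypothetical site structure: each key links to the pages in its list.
site = {"/": ["/a", "/b"], "/a": ["/a1"], "/b": ["/b1"], "/a1": ["/deep"]}
```

With this map, a depth of 1 collects only the start page and the pages it links to directly, while the page limit caps the total regardless of depth. A breadth-first order means pages closest to the starting URL are collected first when the limit is reached.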


Need Help?