Getting Started with NLQL¶
This guide will help you get started with NLQL, from installation to executing your first queries.
Installation¶
Basic Installation¶
Install NLQL with pip:

```bash
pip install python-nlql
```

This installs the core NLQL engine with the Lark parser. For semantic search capabilities, you'll also need an embedding provider.
With Embedding Support¶
For semantic similarity operations (SIMILAR_TO), install with text support:

```bash
pip install python-nlql[text]
```

This includes sentence-transformers for default embedding functionality.
With Vector Database Adapters¶
Install with specific vector database support:
```bash
# ChromaDB
pip install python-nlql[chroma]

# FAISS
pip install python-nlql[faiss]

# Qdrant
pip install python-nlql[qdrant]

# All adapters
pip install python-nlql[all]
```
Your First Query¶
Using In-Memory Data¶
The simplest way to get started is with the built-in MemoryAdapter:

```python
from nlql import NLQL
from nlql.adapters import MemoryAdapter

# Create adapter and add data
adapter = MemoryAdapter()
adapter.add_text("AI agents are autonomous systems", {"topic": "AI"})
adapter.add_text("Machine learning powers modern AI", {"topic": "ML"})
adapter.add_text("Natural language processing", {"topic": "NLP"})

# Initialize NLQL with an explicit adapter
nlql = NLQL(adapter=adapter)

# Execute a simple query
results = nlql.execute("SELECT CHUNK LIMIT 2")

# Print results
for result in results:
    print(result.content)
```
Using a Vector Database¶
With ChromaDB (requires the ChromaAdapter, coming soon):

```python
import chromadb
from nlql import NLQL
from nlql.adapters import ChromaAdapter  # Coming soon

# Create ChromaDB client and collection
client = chromadb.Client()
collection = client.create_collection("my_docs")

# Add documents
collection.add(
    documents=["AI agents are autonomous", "ML powers modern AI"],
    ids=["doc1", "doc2"],
)

# Create adapter and initialize NLQL
adapter = ChromaAdapter(collection)
nlql = NLQL(adapter=adapter)

# Query with semantic search
results = nlql.execute("""
    SELECT CHUNK
    WHERE SIMILAR_TO("artificial intelligence")
    LIMIT 5
""")
```
Basic Query Syntax¶
SELECT Clause¶
Choose the granularity of results:
```sql
-- Full documents
SELECT DOCUMENT

-- Chunks (the default from vector DBs)
SELECT CHUNK

-- Individual sentences
SELECT SENTENCE

-- Sliding window with context
SELECT SPAN(SENTENCE, window=3)
```
WHERE Clause¶
Filter results with various operators:
```sql
-- Semantic similarity
WHERE SIMILAR_TO("AI agents") > 0.8

-- Text matching
WHERE CONTAINS("machine learning")

-- Metadata filtering
WHERE META("date") > "2024-01-01"

-- Combined conditions
WHERE SIMILAR_TO("AI") > 0.7 AND META("topic") == "ML"
```
ORDER BY and LIMIT¶
```sql
-- Order by similarity score
ORDER BY SIMILARITY DESC

-- Order by a metadata field
ORDER BY META("date") DESC

-- Limit results
LIMIT 10
```
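Putting the clauses together: the fragments above compose into a single statement. For example, a query combining filtering, ordering, and a limit (using only the operators shown above) might look like:

```sql
SELECT CHUNK
WHERE SIMILAR_TO("machine learning") > 0.7
  AND META("topic") == "ML"
ORDER BY SIMILARITY DESC
LIMIT 10
```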
Complete Example¶
```python
from nlql import NLQL
from nlql.adapters import MemoryAdapter

# Create adapter
adapter = MemoryAdapter()

# Add documents with metadata
adapter.add_text(
    "AI agents can perceive their environment and take actions.",
    {"date": "2024-01-15", "author": "Alice", "topic": "AI"},
)
adapter.add_text(
    "Machine learning models learn from data without explicit programming.",
    {"date": "2024-01-20", "author": "Bob", "topic": "ML"},
)
adapter.add_text(
    "Natural language processing enables computers to understand human language.",
    {"date": "2024-01-25", "author": "Alice", "topic": "NLP"},
)

# Equivalently, batch-add the same documents in one call
# (use this *instead of* the add_text calls above, or the
# documents will be inserted twice)
texts = [
    "AI agents can perceive their environment and take actions.",
    "Machine learning models learn from data without explicit programming.",
    "Natural language processing enables computers to understand human language.",
]
metadatas = [
    {"date": "2024-01-15", "author": "Alice", "topic": "AI"},
    {"date": "2024-01-20", "author": "Bob", "topic": "ML"},
    {"date": "2024-01-25", "author": "Alice", "topic": "NLP"},
]
adapter.add_texts(texts, metadatas)

# Initialize NLQL
nlql = NLQL(adapter=adapter)

# Execute a metadata-filtered query
results = nlql.execute("""
    SELECT CHUNK
    WHERE META("author") == "Alice"
    LIMIT 5
""")

# Process results
for i, result in enumerate(results, 1):
    print(f"\n--- Result {i} ---")
    print(f"Content: {result.content}")
    print(f"Metadata: {result.metadata}")
```
Semantic Search Example¶
NLQL supports semantic search using the SIMILAR_TO operator, which uses vector embeddings to find semantically similar content:
```python
from nlql import NLQL
from nlql.adapters import MemoryAdapter

# Create adapter with AI-related content
adapter = MemoryAdapter()
adapter.add_text(
    "Artificial intelligence and machine learning are revolutionizing technology.",
    {"category": "AI", "author": "Alice", "year": 2024},
)
adapter.add_text(
    "Neural networks form the foundation of modern deep learning systems.",
    {"category": "ML", "author": "Bob", "year": 2024},
)
adapter.add_text(
    "Natural language processing enables computers to understand human language.",
    {"category": "NLP", "author": "Alice", "year": 2023},
)

nlql = NLQL(adapter=adapter)

# Semantic search query
results = nlql.execute("""
    SELECT CHUNK
    WHERE SIMILAR_TO("deep learning and neural networks") > 0.5
    ORDER BY SIMILARITY DESC
    LIMIT 3
""")

# Results are ordered by semantic similarity
for i, result in enumerate(results, 1):
    similarity = result.metadata['similarity']
    print(f"{i}. [{similarity:.3f}] {result.content}")
    print(f"   Category: {result.metadata['category']}\n")
```
Output:

```text
1. [0.814] Neural networks form the foundation of modern deep learning systems.
   Category: ML
2. [0.777] Artificial intelligence and machine learning are revolutionizing technology.
   Category: AI
3. [0.609] Natural language processing enables computers to understand human language.
   Category: NLP
```
How Semantic Search Works¶

1. Automatic Vectorization: When you use `SIMILAR_TO("query")`, NLQL automatically:
   - Embeds the query text using the default model (`all-MiniLM-L6-v2`)
   - Embeds all text chunks in your data
   - Computes cosine similarity scores

2. Similarity Scores: The similarity score (0-1) is stored in `metadata["similarity"]` and can be:
   - Used in a WHERE clause: `WHERE SIMILAR_TO("query") > 0.8`
   - Used in ORDER BY: `ORDER BY SIMILARITY DESC`
   - Accessed in results: `result.metadata['similarity']`

3. Hybrid Queries: Combine semantic search with metadata filtering:

```python
results = nlql.execute("""
    SELECT CHUNK
    WHERE SIMILAR_TO("AI technology") > 0.6
      AND META("year") == 2024
      AND META("author") == "Alice"
    ORDER BY SIMILARITY DESC
""")
```
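The cosine step in the pipeline above can be sketched in plain Python. The toy `embed` function below is a hypothetical stand-in for a real model such as all-MiniLM-L6-v2 (real embeddings capture meaning; this one merely hashes character bigrams, purely to produce vectors to compare), but `cosine_similarity` is the standard formula:

```python
import math

def embed(text: str) -> list[float]:
    # Hypothetical toy embedding: hash character bigrams into a fixed-size
    # vector. A real model maps meaning, not spelling -- this exists only
    # to make the cosine computation concrete and runnable.
    vec = [0.0] * 64
    lowered = text.lower()
    for a, b in zip(lowered, lowered[1:]):
        vec[(ord(a) * 31 + ord(b)) % 64] += 1.0
    return vec

def cosine_similarity(u: list[float], v: list[float]) -> float:
    # cos(u, v) = (u . v) / (|u| * |v|)
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

# A vector is maximally similar to itself; non-negative count vectors
# always score in [0, 1].
query = embed("deep learning")
assert abs(cosine_similarity(query, query) - 1.0) < 1e-9
print(cosine_similarity(query, embed("neural networks")))
```

Because the toy vectors have non-negative components, scores land in [0, 1], matching the 0-1 range described above.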
Installation for Semantic Search¶
To use semantic search, install with text support:

```bash
pip install python-nlql[text]
```

This installs sentence-transformers for the default embedding provider.
Extensibility (Optional)¶
NLQL is highly extensible. You can customize functions, operators, and embedding providers to fit your specific needs:
```python
from nlql import register_function, register_operator

# Add a custom function for WHERE/ORDER BY clauses
@register_function("word_count")
def word_count(text: str) -> int:
    return len(text.split())

# Add a domain-specific operator
@register_operator("HAS_EMAIL")
def has_email(text: str) -> bool:
    import re
    return bool(re.search(r'[\w\.-]+@[\w\.-]+', text))

# Use them in queries
results = nlql.execute("""
    SELECT CHUNK
    WHERE word_count(content) > 50 AND HAS_EMAIL(content)
""")
```
What you can extend:
- 🔧 Custom Functions: Add reusable logic for WHERE and ORDER BY clauses
- 🎯 Custom Operators: Create domain-specific operators (e.g., HAS_EMAIL, REGEX)
- 🤖 Embedding Providers: Use your own embedding models (OpenAI, Cohere, etc.)
- 🏢 Instance-Level Registration: Different NLQL instances can have different implementations
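As a sketch of the embedding-provider idea: a provider is essentially anything that turns text into fixed-length vectors. The interface below (a class with an `embed` method, and the name `DeterministicProvider`) is a hypothetical illustration of that contract, not NLQL's actual provider API -- see the Extensibility Guide for the real registration mechanism:

```python
import hashlib

class DeterministicProvider:
    """Hypothetical embedding provider: maps texts to fixed-size vectors.

    A real provider would call a model (OpenAI, Cohere, sentence-transformers);
    this one derives a deterministic vector from a SHA-256 hash, purely to
    show the text-in / vectors-out contract.
    """

    dimension = 16

    def embed(self, texts: list[str]) -> list[list[float]]:
        vectors = []
        for text in texts:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            # Scale each byte to [0, 1] to get a fixed-length float vector.
            vectors.append([b / 255.0 for b in digest[: self.dimension]])
        return vectors

provider = DeterministicProvider()
vecs = provider.embed(["AI agents", "machine learning"])
print(len(vecs), len(vecs[0]))  # 2 16
```

Swapping in a real model only changes the body of `embed`; the shape of the contract stays the same.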
📚 Learn More: See the Extensibility Guide for complete documentation and examples. Check the examples/ directory in the repository for runnable code samples.
Next Steps¶
- Learn about Query Syntax in detail
- Explore Data Sources and adapters
- Discover Extensibility options for advanced customization
- Check the API Reference