
# LlamaIndex

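[LlamaIndex](https://www.llamaindex.ai/) is a data framework for LLM applications. To connect it to TensorX, point LlamaIndex's OpenAI integration at the TensorX OpenAI-compatible endpoint, `https://api.tensorx.ai/v1`, and authenticate with your TensorX API key.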

## Prerequisites

* Python 3.8+
* LlamaIndex installed
* TensorX API key from [app.tensorx.ai](https://app.tensorx.ai)

## Installation

```bash
pip install llama-index llama-index-llms-openai
```

## Configuration

### Basic Usage

```python
import os
from llama_index.llms.openai import OpenAI

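# Option 1: set environment variables, which the OpenAI class reads by default
os.environ["OPENAI_API_KEY"] = "your-tensorx-api-key"
os.environ["OPENAI_API_BASE"] = "https://api.tensorx.ai/v1"

# Option 2: pass the credentials explicitly (these take precedence over the environment)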
llm = OpenAI(
    model="claude-sonnet-4-20250514",
    api_key="your-tensorx-api-key",
    api_base="https://api.tensorx.ai/v1"
)

# Simple completion
response = llm.complete("Explain quantum computing in simple terms")
print(response)
```
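Hardcoding the key is convenient for a first test, but it is easy to commit by accident. The following is a minimal sketch of reading the key from the environment instead. The variable name `TENSORX_API_KEY` and the helper `tensorx_llm_kwargs` are assumptions made for illustration; neither is defined by TensorX or LlamaIndex.

```python
import os

TENSORX_API_BASE = "https://api.tensorx.ai/v1"


def tensorx_llm_kwargs(model: str, env_var: str = "TENSORX_API_KEY") -> dict:
    """Build keyword arguments for the OpenAI LLM class from the environment.

    `env_var` is a hypothetical variable name; use whatever your deployment sets.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"model": model, "api_key": api_key, "api_base": TENSORX_API_BASE}


# Usage, once the variable is exported in your shell:
# from llama_index.llms.openai import OpenAI
# llm = OpenAI(**tensorx_llm_kwargs("gpt-4o"))
```

Failing early with a clear message tends to be easier to debug than an authentication error returned by the first request.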

### Chat Interface

```python
from llama_index.core.llms import ChatMessage
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="claude-sonnet-4-20250514",
    api_key="your-tensorx-api-key",
    api_base="https://api.tensorx.ai/v1"
)

messages = [
    ChatMessage(role="system", content="You are a helpful coding assistant"),
    ChatMessage(role="user", content="Write a Python function to calculate fibonacci numbers")
]

response = llm.chat(messages)
print(response)
```

### Streaming Responses

```python
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-4o",
    api_key="your-tensorx-api-key",
    api_base="https://api.tensorx.ai/v1"
)

# Streaming completion
response = llm.stream_complete("Write a short story about AI")
for chunk in response:
    print(chunk.delta, end="", flush=True)  # flush so text appears as it arrives
```
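If you also need the complete text once the stream finishes, you can accumulate the deltas while printing them. This is a sketch under the assumption that each chunk exposes a `delta` attribute, as in the loop above; `collect_stream` is a hypothetical helper, and the stand-in chunks exist only so the example runs without a network call.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Print each streamed delta as it arrives and return the full text."""
    parts = []
    for chunk in chunks:
        delta = chunk.delta or ""  # a chunk's delta may be None or empty
        print(delta, end="", flush=True)
        parts.append(delta)
    print()  # finish the line once the stream ends
    return "".join(parts)


# Stand-in chunks; with a real client, pass llm.stream_complete(...) instead.
fake_stream = [SimpleNamespace(delta=d) for d in ("Once ", "upon ", None, "a time")]
story = collect_stream(fake_stream)
```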

## RAG Applications

Build retrieval-augmented generation (RAG) systems with TensorX:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Configure TensorX LLM
Settings.llm = OpenAI(
    model="claude-sonnet-4-20250514",
    api_key="your-tensorx-api-key",
    api_base="https://api.tensorx.ai/v1"
)

# Configure embeddings (if using TensorX embeddings)
Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",
    api_key="your-tensorx-api-key",
    api_base="https://api.tensorx.ai/v1"
)

# Load documents
documents = SimpleDirectoryReader("./data").load_data()

# Create index
index = VectorStoreIndex.from_documents(documents)

# Query
query_engine = index.as_query_engine()
response = query_engine.query("What are the key points in these documents?")
print(response)
```

## Agent Applications

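Create a tool-using agent backed by a TensorX model. This example uses the `ReActAgent.from_tools` constructor; newer LlamaIndex releases have moved to a workflow-based agent API, so check the LlamaIndex documentation for your installed version: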

```python
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI

# Define tools
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

multiply_tool = FunctionTool.from_defaults(fn=multiply)
add_tool = FunctionTool.from_defaults(fn=add)

# Create agent
llm = OpenAI(
    model="gpt-4o",
    api_key="your-tensorx-api-key",
    api_base="https://api.tensorx.ai/v1"
)

agent = ReActAgent.from_tools(
    [multiply_tool, add_tool],
    llm=llm,
    verbose=True
)

response = agent.chat("What is 20 plus 30, then multiplied by 2?")
print(response)
```
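Before handing tools to an agent, it can help to call them directly so you know what a correct run should produce. The sketch below repeats the two tool functions from the example and chains them in the order the question implies; it makes no LLM call.

```python
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b


def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


# The two tool calls the question requires, in order.
total = add(20, 30)
expected = multiply(total, 2)
print(expected)  # 100
```

With `verbose=True`, the agent's printed reasoning should show the same two calls and arrive at the same result.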

## Configuration with Settings

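To apply one LLM configuration across your application, assign it to `Settings.llm`. Components that are not given an explicit `llm` fall back to this global default: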

```python
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

# Set global LLM
Settings.llm = OpenAI(
    model="claude-sonnet-4-20250514",
    api_key="your-tensorx-api-key",
    api_base="https://api.tensorx.ai/v1",
    temperature=0.7,
    max_tokens=4000
)
```

## Available Models

See [TensorX Models](/api-reference/models.md) for all available models.

| Model                        | Best For                 |
| ---------------------------- | ------------------------ |
| `claude-sonnet-4-20250514`   | Complex RAG, agents      |
| `claude-3-5-sonnet-20241022` | General LLM tasks        |
| `gpt-4o`                     | Multi-modal applications |
| `gpt-4o-mini`                | Cost-effective inference |
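If your application chooses a model per task, the table can be encoded as a small lookup. This is only an illustrative sketch: the task labels and the `pick_model` helper are hypothetical, and the fallback model is an arbitrary choice rather than a TensorX recommendation.

```python
# Task labels are invented for this sketch; model names come from the table above.
MODEL_BY_TASK = {
    "rag": "claude-sonnet-4-20250514",
    "agents": "claude-sonnet-4-20250514",
    "general": "claude-3-5-sonnet-20241022",
    "multimodal": "gpt-4o",
    "low_cost": "gpt-4o-mini",
}


def pick_model(task: str, default: str = "gpt-4o-mini") -> str:
    """Return the model name for a task label, or `default` if the label is unknown."""
    return MODEL_BY_TASK.get(task, default)


print(pick_model("rag"))
```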

## Troubleshooting

### Connection errors

Ensure `api_base` ends with `/v1`:

```python
api_base="https://api.tensorx.ai/v1"  # Correct
api_base="https://api.tensorx.ai"     # Wrong
```
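When the base URL comes from user input or a config file, a small guard can enforce the suffix before the URL reaches the client. This is a minimal sketch; `normalize_api_base` is a hypothetical helper, not part of LlamaIndex or TensorX.

```python
def normalize_api_base(url: str) -> str:
    """Return `url` without trailing slashes and with a `/v1` suffix."""
    base = url.strip().rstrip("/")
    if not base.endswith("/v1"):
        base += "/v1"
    return base


print(normalize_api_base("https://api.tensorx.ai"))
```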

### Token limits

If responses are cut off before they finish, raise `max_tokens`, which caps the number of tokens the model may generate per response:

```python
llm = OpenAI(
    model="claude-sonnet-4-20250514",
    api_key="your-tensorx-api-key",
    api_base="https://api.tensorx.ai/v1",
    max_tokens=8000  # Increase if needed
)
```
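
### Unrecognized model names

Depending on the installed version, the `OpenAI` class may look up the model name in its own table of OpenAI models and raise an `Unknown model` error for names it does not list, such as the Claude models above. One possible workaround is the `OpenAILike` class from the separate `llama-index-llms-openai-like` package, which accepts arbitrary model names. The sketch below only builds the keyword arguments so it runs without that package installed; `openai_like_kwargs` is a hypothetical helper, and the context window is left as a parameter because this page does not state it.

```python
def openai_like_kwargs(model: str, api_key: str, context_window: int) -> dict:
    """Keyword arguments for OpenAILike.

    `context_window` is an assumption the caller must supply from the
    model's documentation; it is not specified on this page.
    """
    return {
        "model": model,
        "api_key": api_key,
        "api_base": "https://api.tensorx.ai/v1",
        "is_chat_model": True,  # route requests to the chat completions endpoint
        "context_window": context_window,
    }


# Usage (requires `pip install llama-index-llms-openai-like`):
# from llama_index.llms.openai_like import OpenAILike
# kwargs = openai_like_kwargs("claude-sonnet-4-20250514", "your-tensorx-api-key", YOUR_CONTEXT_WINDOW)
# llm = OpenAILike(**kwargs)
```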

## Resources

* [LlamaIndex Documentation](https://docs.llamaindex.ai/)
* [LlamaIndex GitHub](https://github.com/run-llama/llama_index)
* [TensorX API Reference](/api-reference/overview.md)

