LlamaIndex (originally GPT Index) was founded in 2022 by Jerry Liu in San Francisco. The framework focuses specifically on the data ingestion and retrieval side of LLM applications, making it easy to index, structure, and query private data with language models.
The company raised $19 million in a Series A round in 2025, building on earlier seed funding. Investors include 8VC and other prominent venture firms. The project has attracted over 30,000 GitHub stars and millions of pip installs.
LlamaIndex provides connectors for over 160 data sources — from PDFs and databases to Slack, Google Drive, and Notion. Its indexing engine structures this data into formats optimized for LLM retrieval, supporting vector indices, keyword indices, knowledge graphs, and tree-based structures.
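To make the difference between two of these index types concrete, here is a minimal sketch in plain Python. It does not use the LlamaIndex API: the function names, the sample documents, and the word-count "embedding" are all assumptions chosen for illustration, standing in for a real embedding model and index implementation.

```python
import math
from collections import Counter, defaultdict

# Hypothetical sample documents, standing in for ingested data.
DOCS = {
    "doc1": "quarterly revenue grew in the enterprise segment",
    "doc2": "the slack export contains support tickets",
    "doc3": "enterprise support tickets mention revenue reporting",
}


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def build_keyword_index(docs):
    """Keyword index: map each term to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def keyword_lookup(index, query):
    """Return the ids of documents that contain every query term."""
    term_sets = [index.get(token, set()) for token in tokenize(query)]
    return set.intersection(*term_sets) if term_sets else set()


def embed(text):
    """Toy embedding: a term-count vector. A real system would call a model."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_vector_index(docs):
    """Vector index: store one vector per document."""
    return {doc_id: embed(text) for doc_id, text in docs.items()}


def vector_search(index, query, top_k=2):
    """Rank documents by similarity to the query and keep the top_k ids."""
    query_vec = embed(query)
    ranked = sorted(index.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]
```

The keyword index answers exact-match questions (which documents contain all of these terms), while the vector index returns a ranked list by similarity, which is the behavior retrieval pipelines typically rely on when a query does not match the source wording exactly.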
The company’s managed product, LlamaCloud, offers enterprise-grade parsing and retrieval services. These include LlamaParse, a document parsing API designed to handle the complex layouts, tables, and charts that simpler PDF extractors miss.
LlamaIndex is often used alongside LangChain, with LlamaIndex handling the data layer while LangChain manages orchestration. The framework’s tight focus on data problems such as chunking strategies, embedding pipelines, and hybrid search has made it a common choice for teams building retrieval-augmented generation (RAG) applications. The company has around 40 employees.
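As an illustration of what a chunking strategy involves, the following is a minimal sketch of fixed-size chunking with overlap, written in plain Python. It is not LlamaIndex's implementation; the function name `chunk_words` and its parameters are hypothetical, and real splitters usually work on tokens or sentences rather than whitespace-separated words.

```python
def chunk_words(text, chunk_size=5, overlap=2):
    """Split text into word windows of chunk_size, sharing `overlap` words.

    Hypothetical helper for illustration only; sizes are counted in
    whitespace-separated words, not model tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

For example, `chunk_words("a b c d e f g h", chunk_size=5, overlap=2)` yields two chunks that share the words `d e`. The overlap is a design choice: it keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of embedding some text twice.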