Introduction to Corvic

What is Corvic

Corvic is a no-code agentic AI platform designed to prepare structured, semi-structured, and unstructured private enterprise data for AI applications. It creates multi-modal embeddings from this multi-structured data. These embeddings let you interact with your data, surface deeper insights and analytics, and make the data ready for further AI use.

As an end-to-end solution, Corvic serves as a co-pilot for data and domain experts by integrating tools for data transformation, embedding generation, and embedding space management into one cohesive platform. This approach enables sophisticated applications in both Generative AI (information retrieval) and Predictive AI (feature engineering), going beyond the capabilities of traditional LLM-based methods used in semantic search.

Core Principles

Improved Productivity: Corvic streamlines data transformation and embedding generation at scale, so you do not need to build your own production-ready embedding generation pipeline. With Corvic, you can start your first data insight spaces within minutes to hours and build production-ready embedding pipelines in a matter of days.
Explainable Quality: Corvic leverages state-of-the-art data transformation together with dedicated GenAI and GraphAI algorithms to produce high-quality signals from data. Real-time metrics, visualization dashboards, and continuous monitoring give you transparent insight into model predictions and let you verify the effectiveness of your embeddings at any time.
Performance & Efficiency: Corvic relies on distributed high-performance computation and optimized proprietary implementations of its embedding algorithms, which are 10x-100x faster than alternative implementations, including open-source software. Corvic’s embedding algorithms do not necessarily depend on exotic hardware or excessive GPU usage.
Security & Privacy: Corvic offers data protection through best practices of secure development and state-of-the-art cryptography. Corvic leverages top-tier privacy measures and compliance standards to ensure the privacy of its customers.

The Architecture

The Corvic Platform supports the ingestion of multi-structured data. It generates, manages, and stores multi-space embeddings, which can then be accessed via vector search and retrieval APIs. The platform also supports creating and configuring agents that leverage the multi-space embeddings to generate dynamic policies, providing deeper insight from the ingested data.
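To make the retrieval step concrete, here is a minimal sketch of vector search over stored embeddings using cosine similarity. It does not use Corvic's APIs, which this page does not describe; the `store` dictionary, the `acct-*` keys, and the `top_k` helper are all hypothetical and exist only for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(store, query, k=2):
    """Return the k keys whose stored embeddings are most similar to query."""
    ranked = sorted(store, key=lambda key: cosine(store[key], query), reverse=True)
    return ranked[:k]

# Hypothetical stored embeddings, keyed by entity ID (not real Corvic output).
store = {
    "acct-1": [1.0, 0.0, 0.0],
    "acct-2": [0.9, 0.1, 0.0],
    "acct-3": [0.0, 1.0, 0.0],
}
nearest = top_k(store, query=[1.0, 0.05, 0.0], k=2)
print(nearest)
```

A brute-force scan like this is only practical for small stores; production vector search typically relies on an approximate nearest-neighbor index instead.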

Platform Components

Organization

An Organization in the Corvic Platform is the overall structure that manages users, roles, and data rooms. Each organization’s data is isolated from that of other organizations and receives dedicated workload execution. As an admin, you can have full access to the organization's data rooms and their functionality.

Data Room

A Data Room is an exclusive workspace where you can upload data, manage ingestion pipelines, generate feature views, run spaces, and create agents. Each data room is isolated, meaning the data, features, spaces, and agents in one room are not accessible from another.

Ingestion Pipeline

An "Ingestion Pipeline" collects and transforms the raw data that you upload to a data room for further processing. This raw data can take the form of text documents, tables, or databases. It's important to ensure the data is clean and well-structured to maximize the effectiveness of the platform's transformation and embedding tools.

Feature View

A "Feature View" allows you to specify "what" entity or object you want to embed. More specifically, it lets you define the pre-image of your desired embedding space: the input sources that contribute to it and the entities it represents. For instance, suppose you have uploaded two parquet files: "accounts.parquet" (Account), a list of financial accounts, and "transactions.parquet" (Transaction), a list of financial transactions between pairs of accounts. You can create a feature view that represents your account IDs by using both Account and Transaction as input sources and setting Account.account_id as the space key.

💡

A Feature View specifies what needs to be converted to embeddings, but it doesn't specify how it should be embedded.
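The accounts and transactions example can be sketched as plain data. Corvic is a no-code platform, so this is not how you would define a feature view in practice; the `FeatureView` class, the `keys_for` helper, and the `src`, `dst`, and `amount` column names are assumptions made purely to illustrate the idea of input sources plus a space key.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FeatureView:
    """Hypothetical description of WHAT to embed (not how)."""
    sources: tuple   # names of the contributing input sources
    space_key: str   # "Source.column" whose values become the embedding keys

def keys_for(view, tables):
    """Collect the distinct key values that the view asks to be embedded."""
    source, column = view.space_key.split(".")
    return sorted({row[column] for row in tables[source]})

# Toy rows standing in for accounts.parquet and transactions.parquet.
# The transaction column names are invented for this sketch.
tables = {
    "Account": [{"account_id": "A1"}, {"account_id": "A2"}],
    "Transaction": [{"src": "A1", "dst": "A2", "amount": 50.0}],
}
view = FeatureView(sources=("Account", "Transaction"),
                   space_key="Account.account_id")
keys = keys_for(view, tables)
print(keys)
```

Note that the view only names the sources and the key. Nothing in it says which algorithm turns those inputs into vectors.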

Space

A "Space" refers to the process of transforming your defined feature views into embeddings. Here, you specify the "how" - the method of embedding. This involves selecting an algorithm, setting parameters, and initiating the space. Once the space is complete, your embeddings are ready for analysis and export.

💡

A Space details how to use input sources in your Feature View to generate embeddings for each key defined by the output entities within that View.
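As a rough illustration of the "how", the sketch below pairs an algorithm name with its parameters and applies it to a set of keys. Everything here is hypothetical: `SpaceConfig`, `run_space`, the `ALGORITHMS` registry, and the toy `flow_embedding` function (which simply sums outgoing and incoming amounts per account) are stand-ins and bear no relation to Corvic's proprietary embedding algorithms.

```python
from dataclasses import dataclass, field

@dataclass
class SpaceConfig:
    """Hypothetical description of HOW to embed: an algorithm plus parameters."""
    algorithm: str
    params: dict = field(default_factory=dict)

def flow_embedding(keys, transactions, scale=1.0):
    """Toy stand-in algorithm: embed each account as (outflow, inflow)."""
    vectors = {key: [0.0, 0.0] for key in keys}
    for tx in transactions:
        if tx["src"] in vectors:
            vectors[tx["src"]][0] += tx["amount"] / scale
        if tx["dst"] in vectors:
            vectors[tx["dst"]][1] += tx["amount"] / scale
    return vectors

ALGORITHMS = {"flow": flow_embedding}  # hypothetical algorithm registry

def run_space(config, keys, transactions):
    """Look up the chosen algorithm and apply it with the chosen parameters."""
    return ALGORITHMS[config.algorithm](keys, transactions, **config.params)

# Invented transactions between two accounts.
transactions = [
    {"src": "A1", "dst": "A2", "amount": 50.0},
    {"src": "A2", "dst": "A1", "amount": 20.0},
]
config = SpaceConfig(algorithm="flow", params={"scale": 10.0})
embeddings = run_space(config, ["A1", "A2"], transactions)
print(embeddings)
```

Changing `config` swaps the method or its parameters without touching the definition of which keys and sources are involved, which mirrors the split between "what" and "how".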

You can analyze a space using the platform's visualization dashboard as well as normalized quality metrics such as Stable Rank, Reciprocal Condition Number, and NE Sum.
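For intuition, two of these metrics have standard textbook definitions that can be sketched directly. For an embedding matrix A with one embedding per row, the stable rank is the squared Frobenius norm divided by the squared largest singular value, and the reciprocal condition number is the smallest singular value divided by the largest. This page does not say how Corvic normalizes these metrics or how it defines NE Sum, so the code below covers only the textbook versions and uses a simple power iteration as an assumed stand-in for a proper SVD routine.

```python
import math

def gram(rows):
    """Return G = A^T A for a row-major matrix A (one embedding per row)."""
    d = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]

def top_eigenvalue(sym, iters=500):
    """Largest eigenvalue of a symmetric PSD matrix via power iteration.

    Converges slowly when the top two eigenvalues are close, and assumes the
    uniform start vector is not orthogonal to the leading eigenvector.
    """
    d = len(sym)
    v = [1.0 / math.sqrt(d)] * d
    value = 0.0
    for _ in range(iters):
        w = [sum(sym[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
        value = norm
    return value

def stable_rank(rows):
    """||A||_F^2 / sigma_max^2, computed as trace(G) / lambda_max(G)."""
    g = gram(rows)
    trace = sum(g[i][i] for i in range(len(g)))
    return trace / top_eigenvalue(g)  # assumes A is not all zeros

def reciprocal_condition_number(rows):
    """sigma_min / sigma_max, using a shifted power iteration for lambda_min."""
    g = gram(rows)
    d = len(g)
    lam_max = top_eigenvalue(g)
    shifted = [[(lam_max if i == j else 0.0) - g[i][j] for j in range(d)]
               for i in range(d)]
    lam_min = max(0.0, lam_max - top_eigenvalue(shifted))
    return math.sqrt(lam_min / lam_max)

# Toy embedding matrix with singular values 3 and 4.
embeddings = [[3.0, 0.0], [0.0, 4.0]]
sr = stable_rank(embeddings)
rc = reciprocal_condition_number(embeddings)
print(sr, rc)
```

Under these definitions, a stable rank close to the embedding dimension and a reciprocal condition number close to 1 both indicate that the embeddings spread evenly across dimensions rather than collapsing onto a few directions.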

Agent

An "Agent" refers to a natural language processing entity that utilizes an LLM-powered orchestration mechanism to execute multi-space traversal workflows tailored to the user's inquiries. Each agent is configured with a set of input spaces (along with specific instructions for using those spaces) and parameters for the completion LLM.

💡

An Agent plans how to leverage multiple embedding spaces to derive enterprise insights for user queries.

The agent combines information from multiple embedding spaces to generate enterprise insights, leveraging the LLM for enhanced reasoning. This process involves orchestrating pre-configured spaces and the LLM to execute policies that address complex queries across various data spaces.
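The orchestration idea can be sketched as follows: an agent holds several spaces, each with its own usage instructions, retrieves the best match from every space, and assembles the results into a prompt for the completion LLM. This is an illustrative sketch only. `SpaceBinding`, `AgentConfig`, `plan_context`, and `build_prompt` are hypothetical names, the query is assumed to be already embedded as `query_vector`, and the LLM call itself is left out.

```python
from dataclasses import dataclass

@dataclass
class SpaceBinding:
    """Hypothetical pairing of one input space with its usage instructions."""
    name: str
    instructions: str
    vectors: dict  # key -> embedding

@dataclass
class AgentConfig:
    """Hypothetical agent: input spaces plus completion-LLM parameters."""
    spaces: list
    llm_params: dict

def dot(a, b):
    """Dot product, used here as a simple similarity score."""
    return sum(x * y for x, y in zip(a, b))

def plan_context(agent, query_vector, k=1):
    """Retrieve the top-k keys from every configured space, in order."""
    context = []
    for space in agent.spaces:
        ranked = sorted(space.vectors,
                        key=lambda key: dot(space.vectors[key], query_vector),
                        reverse=True)
        context.append((space.name, space.instructions, ranked[:k]))
    return context

def build_prompt(question, context):
    """Assemble the per-space results into text for the completion LLM."""
    lines = [f"Question: {question}"]
    for name, instructions, keys in context:
        lines.append(f"[{name}] {instructions} -> {', '.join(keys)}")
    return "\n".join(lines)

# Invented spaces and embeddings for illustration.
agent = AgentConfig(
    spaces=[
        SpaceBinding("accounts", "Use for account lookups.",
                     {"A1": [1.0, 0.0], "A2": [0.0, 1.0]}),
        SpaceBinding("documents", "Use for policy text.",
                     {"D1": [0.2, 0.8], "D2": [0.9, 0.1]}),
    ],
    llm_params={"temperature": 0.0},
)
context = plan_context(agent, query_vector=[1.0, 0.0])
prompt = build_prompt("Which account is most active?", context)
print(prompt)
```

In a real agent, the LLM would also decide which spaces to consult and in what order. This sketch queries every space once to keep the flow easy to follow.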