Enigma Data

Enigma's data platform delivers comprehensive intelligence on every U.S. business — from sole proprietorships to multinationals — powered by graph-model-1, a knowledge graph with 2.4 billion+ nodes mapping the U.S. business landscape.

Data Model

What exactly is a business? It's not as simple as it seems. Beneath familiar names and storefronts lies a web of identities, relationships, and operations that defy simple definitions:

  • What is a business's name? (Legal names, trade names, and more.)
  • How do franchises or multi-brand corporations fit into a single definition?
  • What about businesses sharing addresses or operating multiple locations?

Enigma's data model is built upon four Core Entity Types:

  • Brands: How a business presents itself to customers.
  • Legal Entities: How a business is recognized by the government.
  • Operating Locations: Sites where a business conducts its activities.
  • Persons: The individuals linked to a business—owners, officers, registered agents, and other key personnel.

These entities are connected to one another through multiple Relationships. Enigma creates entities and relationships when it observes evidence of a business's activity in real-world records.
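As a rough illustration of the four core entity types, the sketch below models each one as a small Python dataclass. The class and field names (`brand_id`, `legal_name`, and so on) are hypothetical and chosen only for readability; they are not Enigma's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brand:
    """How a business presents itself to customers."""
    brand_id: str   # hypothetical identifier
    name: str       # customer-facing trade name


@dataclass(frozen=True)
class LegalEntity:
    """How a business is recognized by the government."""
    legal_entity_id: str  # hypothetical identifier
    legal_name: str       # name as it appears on a registration


@dataclass(frozen=True)
class OperatingLocation:
    """A site where a business conducts its activities."""
    location_id: str  # hypothetical identifier
    address: str


@dataclass(frozen=True)
class Person:
    """An individual linked to a business (owner, officer, agent)."""
    person_id: str  # hypothetical identifier
    full_name: str


# Illustrative records only; none of these values come from Enigma data.
brand = Brand("brand:1", "Example Coffee")
entity = LegalEntity("le:1", "Example Coffee Holdings LLC")
location = OperatingLocation("loc:1", "123 Main St.")
agent = Person("person:1", "Jane Doe")
```

Keeping the four types separate, rather than folding them into a single "business" record, is what lets one brand point at many locations and legal entities.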

Brands

A brand is the face of a business as seen by its customers. It includes trade names, logos, and marketing identities. Brands may:

  • Operate across multiple locations.
  • Be owned by one or more legal entities.
  • Coexist with other brands under a shared corporate structure.

Example: Starbucks is a global brand known for coffee shops. Each store represents the same brand but is a distinct operating location, and stores may be tied to different legal entities in each country.

Legal Entities

Legal entities are how businesses are recognized by governments and regulatory bodies. They are the backbone for taxation, compliance, and legal accountability. A single legal entity can own multiple brands or operate many locations.

Example: Starbucks Corporation is the legal entity behind the Starbucks brand. It ensures compliance with laws, pays taxes, and manages corporate governance.

Operating Locations

Operating locations represent the physical or virtual spaces where a business interacts with customers or conducts activities. Locations connect brands to legal entities, grounding abstract concepts in specific places.

Example: A single Starbucks store at 123 Main St. is an operating location. It's tied to the Starbucks brand and operates under a local legal entity.

Persons

Persons represent the individuals linked to the business graph: owners, officers, registered agents, and other key personnel. Person entities connect to legal entities (for a sole proprietorship, the person is the legal entity) and to roles performed at brands or operating locations.

Example: The registered agent for Starbucks Corporation listed on a Secretary of State filing is a Person entity, linked to that legal entity and potentially to specific operating locations where they hold a role.

Relationships Between Entities

Enigma's data model captures the relationships that connect these core entities, such as:

  • Brand-to-Location: Which brands operate at which locations.
  • Brand-to-Legal Entity: Which legal entities own or manage a brand.
  • Location-to-Legal Entity: Which legal entities are responsible for specific operating locations.
  • Person-to-Legal Entity: Which individuals are officers, owners, or registered agents of a legal entity.
  • Person-to-Brand / Person-to-Location: Which individuals hold roles at specific brands or operating locations.
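One way to picture these relationships is as typed edges between entity identifiers. The sketch below is a minimal illustration under that assumption: the relationship labels, the identifiers, and the helper functions are all hypothetical and do not reflect Enigma's internal graph or API. It walks Brand-to-Location and then Location-to-Legal Entity edges to find every legal entity operating a brand's locations.

```python
from collections import defaultdict

# Hypothetical edges: (relationship type, source id, target id).
EDGES = [
    ("brand_to_location", "brand:coffee-co", "loc:123-main-st"),
    ("brand_to_location", "brand:coffee-co", "loc:9-oak-ave"),
    ("location_to_legal_entity", "loc:123-main-st", "le:coffee-co-corp"),
    ("location_to_legal_entity", "loc:9-oak-ave", "le:oak-franchise-llc"),
    ("person_to_legal_entity", "person:jane-doe", "le:coffee-co-corp"),
]


def build_index(edges):
    """Index targets by (relationship type, source id) for fast lookup."""
    index = defaultdict(list)
    for kind, source, target in edges:
        index[(kind, source)].append(target)
    return index


def legal_entities_for_brand(index, brand_id):
    """Follow brand -> location -> legal entity edges for one brand."""
    entities = set()
    for location in index.get(("brand_to_location", brand_id), []):
        entities.update(index.get(("location_to_legal_entity", location), []))
    return sorted(entities)


index = build_index(EDGES)
print(legal_entities_for_brand(index, "brand:coffee-co"))
```

In this made-up data the brand resolves to two legal entities, which is the franchise-style pattern where one brand is operated by more than one company.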

The Complex Nature of Businesses

Businesses often include:

  • Multiple Legal Entities: A company may establish separate legal entities for different locations or functions.
  • Multiple Brands: Corporations like Gap Inc. operate distinct brands like Old Navy and Banana Republic.
  • Affiliated Brands: Dealers (e.g., Curry Honda) or co-locations (e.g., Sephora at JCPenney).
  • Franchises: Independent operators under a shared brand, such as McDonald's franchisees.
  • Agents and Professionals: Individuals operating under umbrella brands (e.g., "James Lavelle, State Farm Agent").
  • Medical Providers: Patients often seek specific doctors who work within practices owned by larger health systems, blending individual and institutional branding.
  • Persons as Brands: In services like hairstyling or therapy, individuals are the brand.
  • Legal Entities as Brands: Some businesses use their legal name as their brand.

Data Sources

Enigma's data is built on four primary source types that together power ground truth business identity.

Government sources form the authoritative foundation. Corporate registrations, Secretary of State filings, franchise disclosures, and professional licenses establish the legal identity map of U.S. businesses. Because every record originates from a government body, provenance is clear — making this layer essential for KYB, compliance, and due diligence workflows.

Online sources capture the operational reality of a business — websites, directories, review platforms, and social profiles. This data is inherently dynamic and often reflects changes in hours, location, or status before they appear in any official filing. That makes online sources a critical early-signal layer, keeping entity profiles current between government record updates.

Third-party providers extend coverage with targeted insights such as enriched firmographics and contact information. Rather than being treated as ground truth, third-party data is weighted and reconciled against other sources to extend attribute breadth without compromising data integrity.

Card panel data is sourced from a consortium of issuer banks aggregating actual card transaction data at the merchant level. This delivers real revenue trends, transaction volumes, customer counts, and growth rates derived from what businesses actually earn — across 750M+ credit and debit cards in the U.S.
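The text above says third-party data is weighted and reconciled against other sources but does not describe how. Purely as an illustration of the general idea, the sketch below uses a simple weighted vote: each source type gets a trust weight, and the attribute value with the highest total weight wins. The weights, the `reconcile` function, and the sample phone numbers are assumptions made for this example, not Enigma's actual method or values.

```python
from collections import defaultdict

# Hypothetical trust weights per source type; not Enigma's actual values.
SOURCE_WEIGHTS = {"government": 1.0, "online": 0.7, "third_party": 0.4}


def reconcile(observations):
    """Pick the attribute value with the highest total source weight.

    observations: list of (source_type, value) pairs for one attribute.
    Returns None when there are no observations.
    """
    scores = defaultdict(float)
    for source_type, value in observations:
        # Unknown source types contribute no weight.
        scores[value] += SOURCE_WEIGHTS.get(source_type, 0.0)
    if not scores:
        return None
    return max(scores, key=scores.get)


# Made-up observations of a single attribute from three source types.
phone = reconcile([
    ("third_party", "555-0100"),
    ("online", "555-0199"),
    ("government", "555-0199"),
])
print(phone)
```

A production system would likely also account for record freshness and per-attribute reliability; this sketch ignores both.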


Data Quality

Enigma validates data quality continuously against labeled ground truth data. Key benchmarks:

Area | Metric
Entity linking (brands to legal entities) | 95% precision
Entity linking (brands to operating locations) | 94% of brands have complete location links
Industry classification (NAICS code assignments) | 98% precision
Location data (operating location status and addresses) | 95% precision
Card revenue estimates (within ±30% of actual values) | 67% of brands

For a detailed walkthrough of Enigma's data quality methodology, see the Enigma Data Quality whitepaper. For card revenue accuracy specifics, see Understanding Card Revenue Data.
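For readers who want to see how benchmarks of this kind are typically computed, here is a minimal sketch using standard definitions: precision as correct predictions over all predictions, and a tolerance share as the fraction of estimates that fall within ±30% of the labeled actual value. The function names and sample numbers are illustrative assumptions; Enigma's own evaluation procedure is described in the whitepaper, not here.

```python
def precision(true_positives, false_positives):
    """Fraction of predicted links that are correct."""
    predicted = true_positives + false_positives
    if predicted == 0:
        raise ValueError("no predictions to evaluate")
    return true_positives / predicted


def share_within_tolerance(estimates, actuals, tolerance=0.30):
    """Fraction of estimates within +/- tolerance of the actual value."""
    if len(estimates) != len(actuals) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1
        for est, act in zip(estimates, actuals)
        # Skip zero actuals: a relative tolerance is undefined there.
        if act != 0 and abs(est - act) <= tolerance * abs(act)
    )
    return hits / len(estimates)


# Made-up evaluation data, for illustration only.
link_precision = precision(true_positives=95, false_positives=5)
revenue_share = share_within_tolerance(
    estimates=[100, 140, 60, 129],
    actuals=[100, 100, 100, 100],
)
print(link_precision, revenue_share)
```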


Data Delivery

The output of Enigma's data pipeline is available through three delivery channels:

  • Bulk file delivery — for customers who want to work with the full dataset. Files are delivered in CSV or Parquet format to the Enigma Console, your S3 bucket, or an SFTP server. Best for list generation, market analysis, and offline enrichment workflows.
  • Enigma API — for real-time query and enrichment use cases. The GraphQL API lets you search and retrieve entity data programmatically, record by record, and integrate it directly into your applications or underwriting workflows. See the GraphQL API guide.
  • Agentic workflows — for automated decisioning and orchestration. Enigma's MCP server exposes entity data as tools that AI agents can call directly, enabling natural language queries and automated enrichment pipelines. See the AI & MCP integrations guide.
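As a small illustration of the bulk file workflow, the sketch below filters a CSV extract down to a target list using only the Python standard library. The column names (`brand_name`, `state`, `naics_code`) and the sample rows are made up for this example; the real file layout is defined by the delivered dataset, so treat this as a pattern rather than a schema reference.

```python
import csv
import io

# Hypothetical sample of a bulk CSV file; column names are assumptions.
SAMPLE_CSV = """brand_name,state,naics_code
Example Coffee,NY,722515
Example Hardware,CA,444140
Example Bakery,NY,311811
"""


def filter_rows(csv_file, column, value):
    """Return the rows of a CSV file object where column equals value."""
    reader = csv.DictReader(csv_file)
    return [row for row in reader if row.get(column) == value]


# In practice csv_file would come from open("path/to/file.csv", newline="").
ny_rows = filter_rows(io.StringIO(SAMPLE_CSV), "state", "NY")
print([row["brand_name"] for row in ny_rows])
```

The same filtering logic applies to Parquet deliveries, though reading that format requires a third-party library not shown here.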