File Delivery
If a List Generation workflow is suitable for the use case at hand, the file delivery service is the best way to access Enigma data.
Please reach out to the Enigma team if you need to generate a list. The steps below outline how to best prepare for that conversation.
Requesting the Desired Output
It's important to be aware of the choices available when generating a list so that the output file meets the requirements of your use case. The following are some things to consider.
Entity Type
Indicate whether the desired file should contain a list of businesses or business locations. One file will contain a list of only one of these entity types.
Filters
Describe to the Enigma team the type of business or business location you are targeting. For the dimensions along which Enigma data can be filtered, see the Attribute Dictionary.
Attributes
Before discussing list generation with a member of the Enigma team, it helps to familiarize yourself with the Attribute Dictionary. It explains the attributes that can be included in the output file alongside the basic identifiers of the business or business location, such as business name, address, and Enigma ID.
Some attributes are available as a monthly time series (currently, only the Merchant Transaction Signals). Let the Enigma team know whether you want only the most recently available month of transaction-related data or multiple months of history.
Output Format
- CSV: Standard comma-separated values format, ideal for spreadsheet applications and general analysis.
- Parquet: Columnar storage format, optimized for large-scale data processing and analytics workflows.
File Structure
The structure of the output file is also something that can be customized.
Enigma data is stored as a collection of attributes. Some of these attributes are represented as objects with properties associated with them. The file can be structured so that all such attributes are either flattened or unflattened.
- Flattened Structure
  - No columns with nested values
  - Each property becomes its own column
  - Best for spreadsheet analysis
  - Example: industries.classification_type becomes a separate column
- Unflattened Structure
  - Contains columns with nested values
  - Properties remain nested within parent objects
  - Optimized for data pipeline ingestion
  - Example: All properties remain nested in the industries column
Number of Output Files
In most cases, Enigma recommends delivering the output in one file. However, splitting the data across multiple files is recommended in two scenarios:
- Firmographics + Time Series: If you want both firmographic attributes and multiple months of history for a time-series attribute, we recommend one file for firmographics and one file for time-series data. The Enigma ID serves as the matching key between the files.
- Mixed Entity Types: If you want to see matches for both businesses and business locations, we recommend one file for business matches and a separate file for business location matches.
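When firmographics and time-series data arrive as separate files, they can be recombined on the Enigma ID. The following is a minimal sketch using only the Python standard library. The column names (`enigma_id`, `name`, `period`, `card_revenue`) and the sample rows are hypothetical; the actual headers depend on the attributes requested for your list.

```python
import csv
import io

# Hypothetical file contents; real headers depend on the attributes requested.
FIRMOGRAPHICS_CSV = """enigma_id,name
B1,Acme Coffee
B2,Bolt Hardware
"""

TIME_SERIES_CSV = """enigma_id,period,card_revenue
B1,2024-01,1000
B1,2024-02,1200
B2,2024-01,500
"""


def join_on_enigma_id(firmographics_file, time_series_file, key="enigma_id"):
    """Attach firmographic columns to every time-series row with the same ID."""
    # Index the firmographics file by ID (one row per entity).
    firmographics = {row[key]: row for row in csv.DictReader(firmographics_file)}
    joined = []
    for row in csv.DictReader(time_series_file):
        match = firmographics.get(row[key])
        if match is None:
            continue  # no firmographics for this ID: skip it (inner join)
        joined.append({**match, **row})
    return joined


rows = join_on_enigma_id(io.StringIO(FIRMOGRAPHICS_CSV), io.StringIO(TIME_SERIES_CSV))
for row in rows:
    print(row)
```

With real deliveries, replace the `io.StringIO` objects with files opened via `open(path, newline="")`.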
Retrieving Files from Enigma
There are two standard ways to retrieve files output by Enigma:
- Download them directly via the Enigma Console File Manager
- Have them published into a user-defined Data Source:
  - Your SFTP server
  - Your Amazon S3 bucket
  - A private SFTP account on an Enigma SFTP server
For information on setting up a Data Source, please reference the Console File Manager documentation.
Due to a technical limitation at this time:
- Parquet files cannot be downloaded directly from the Console File Manager web interface
- Parquet files can be accessed when copied to a user-defined Data Source
- As a temporary workaround, Parquet files may be converted to CSV, TSV, or PSV for web interface downloads
- Contact support@enigma.com for assistance with Parquet file access
Interpreting the Output
The output file will contain columns representing attributes describing a business or business location.
- Enigma ID is always included as the first column
- Column names follow a consistent naming pattern
- Custom column names available upon request
Column Naming Conventions
Pattern | Example | Description |
---|---|---|
`attribute__X` | `names__0` | Array elements, where X is the index |
`attribute__X__property` | `addresses__0__street_address1` | Array of objects |
`attribute__property` | `card_revenue__growth__3m__rate_sa` | Nested object properties |
`attribute__property__X` | `industries__codes__0` | Nested arrays |
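To make the naming patterns concrete, here is a minimal sketch of how a nested record could map to flattened column names. It assumes the double-underscore separator and zero-based indices seen in examples such as `industries__codes__0`. The `flatten` helper and the sample record are hypothetical illustrations, not Enigma's implementation.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into {column_name: scalar} using '__' joins."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)  # list positions become numeric name parts
    else:
        return {prefix: value}  # scalar: the accumulated prefix is the column name
    columns = {}
    for key, child in items:
        name = f"{prefix}__{key}" if prefix else str(key)
        columns.update(flatten(child, name))
    return columns


# Hypothetical unflattened record, for illustration only.
record = {
    "names": ["Acme Coffee", "Acme Cafe"],
    "addresses": [{"street_address1": "1 Main St"}],
    "industries": {"codes": ["1234"]},
}

flat = flatten(record)
for column, value in flat.items():
    print(column, "=", value)
```

Here `names` yields `names__0` and `names__1`, the address object yields `addresses__0__street_address1`, and the nested array yields `industries__codes__0`.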
If a time-series attribute was selected (currently, only the Merchant Transaction Signals attributes), there will be multiple rows per Enigma ID - one for each month in the time series.
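Because a time-series file repeats each Enigma ID once per month, it is often convenient to group rows by ID before analysis. The sketch below shows one way to do this with the standard library. The column names (`enigma_id`, `period`, `card_revenue`) and the sample values are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical time-series output: one row per Enigma ID per month.
SAMPLE = """enigma_id,period,card_revenue
B1,2024-01,1000
B1,2024-02,1200
B2,2024-01,500
"""


def group_by_enigma_id(csv_file, key="enigma_id"):
    """Collect all monthly rows for each Enigma ID, preserving file order."""
    series = defaultdict(list)
    for row in csv.DictReader(csv_file):
        series[row[key]].append(row)
    return dict(series)


series = group_by_enigma_id(io.StringIO(SAMPLE))
for enigma_id, months in series.items():
    print(enigma_id, [m["period"] for m in months])
```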
For more information about our attributes, see the Attribute Dictionary.
For more information about file delivery options, see the Console File Manager documentation.