Live data connections to Lexio can be achieved through our data dropzone: data is pushed to Lexio on a recurring basis. The dropzone is a data lake hosted by Narrative Science (using AWS S3).


We will provide unique credentials that allow customers to drop files into their customer-specific location.
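
As a minimal sketch of the connection setup, assuming the provided credentials are an AWS access key pair (the names and values below are placeholders for what we supply):

import boto3

# Placeholder credentials -- substitute the values provided to you.
ACCESS_KEY_ID = "YOUR_ACCESS_KEY_ID"
SECRET_ACCESS_KEY = "YOUR_SECRET_ACCESS_KEY"
DROPZONE_BUCKET = "your-dropzone-bucket"

# Build an S3 client scoped to the dropzone credentials.
s3 = boto3.client(
    "s3",
    aws_access_key_id=ACCESS_KEY_ID,
    aws_secret_access_key=SECRET_ACCESS_KEY,
)

# Sanity check: confirm the credentials can reach the bucket.
s3.head_bucket(Bucket=DROPZONE_BUCKET)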




Structure and Schemas


Regardless of how data is loaded into the data lake, the structure of the integration folder (i.e., the top-level folder in the bucket) should conform to:


INTEGRATION_NAME/

        TABLE_NAME/

                *[.parquet|.csv...]
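
For example, a minimal upload that conforms to this layout could look like the following sketch (the bucket, integration, and table names are placeholders):

import boto3

s3 = boto3.client("s3")  # authenticate with the credentials provided

# Placeholder names; substitute your integration and table names.
bucket = "your-dropzone-bucket"
integration = "acme_sales"
table = "orders"

# The key follows INTEGRATION_NAME/TABLE_NAME/<file>.
key = f"{integration}/{table}/{table}.parquet"
s3.upload_file("orders.parquet", bucket, key)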


All of the files for each table should have the same format and the same schema. We leverage an AWS Glue Crawler to infer the format and schema via its default classifier; see the AWS Glue documentation to learn about the file formats it supports. In general, you should use Parquet (preferred) or standard CSVs.
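
To illustrate the preferred format, a table can be written to Parquet with pandas (this assumes pyarrow is installed; the table and columns are made up):

import pandas as pd

df = pd.DataFrame(
    {
        "order_id": [1001, 1002],
        "region": ["east", "west"],
        "amount": [250.0, 410.5],
    }
)

# Parquet files carry their own schema, so the crawler can infer
# column names and types directly from the file.
df.to_parquet("orders.parquet", index=False)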


For CSVs, each file must have the following properties (a sketch of producing such a file follows the list):


  • UTF-8 as its encoding
  • , as its field delimiter
  • \n as its line delimiter
  • \ as its escape character
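
One way to produce a file that satisfies these constraints is Python's csv module; this is a sketch that assumes backslash-escaping in place of quote-doubling:

import csv

rows = [
    ["order_id", "region", "amount"],
    [1001, "east", 250.0],
    [1002, "west", 410.5],
]

# newline="" lets the csv writer control line endings itself,
# guaranteeing \n rather than the platform default.
with open("orders.csv", "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(
        f,
        delimiter=",",
        lineterminator="\n",
        escapechar="\\",
        doublequote=False,  # escape embedded quotes with \ instead of ""
    )
    writer.writerows(rows)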



NOTE - Each upload should contain all of the historical data and fully overwrite the older file. The filename should not change between uploads.
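
Putting the pieces together, a recurring job (run from cron or any scheduler) might re-export the full history and overwrite the same key on every run; this is a sketch, and all names are placeholders:

import boto3
import pandas as pd

def push_full_refresh(df: pd.DataFrame, bucket: str, integration: str, table: str) -> None:
    """Write the full table to Parquet and overwrite the dropzone copy."""
    local_path = f"{table}.parquet"
    df.to_parquet(local_path, index=False)

    # The key (and therefore the filename) is identical on every run,
    # so each upload fully replaces the previous file.
    key = f"{integration}/{table}/{table}.parquet"
    boto3.client("s3").upload_file(local_path, bucket, key)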




Example Data Flows


While data can flow into the Lexio dropzone from anywhere through custom scripting, here are some examples and resources that may be helpful for configuring the live connection: