The CSV data source adapter allows querying CSV files as relational tables. Please make sure that the first line of each CSV file contains a header according to the specification below. The adapter supports multiple CSV files. Every file is represented as a table, using the file name as the table name. Make sure that the file name contains no spaces. The adapter also supports files compressed using gzip. The file name must either end with
The CSV adapter is read-only. DML queries are not supported. The content of a file can be changed in the background as long as the schema (the number of columns and their types) does not change.
The first column is used as the primary key and may therefore only contain unique values. Please note that the CSV adapter does not support null values.
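The two constraints above can be checked before a file is handed to the adapter. The following is a minimal sketch, not part of the adapter itself: the helper name `check_rows` is hypothetical, and treating an empty field as a null value is an assumption made for illustration.

```python
import csv
import io


def check_rows(lines):
    """Return a list of problems found in the data rows (header excluded).

    Hypothetical helper: flags duplicate values in the first column and
    empty fields, which this sketch assumes would count as null values.
    """
    problems = []
    seen_keys = set()
    reader = csv.reader(lines)
    next(reader, None)  # skip the header line
    for line_no, row in enumerate(reader, start=2):
        if any(field == "" for field in row):
            problems.append(f"line {line_no}: empty field (null values are not supported)")
        if row:
            key = row[0]  # the first column acts as the primary key
            if key in seen_keys:
                problems.append(f"line {line_no}: duplicate primary key {key!r}")
            seen_keys.add(key)
    return problems


# Made-up sample data: the last row repeats key 2 and has an empty field.
sample = io.StringIO("id:int,name:string\n1,Alice\n2,Bob\n2,\n")
for problem in check_rows(sample):
    print(problem)
```

Running the check on every file in the directory before deployment should surface both kinds of problems early.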
The CSV adapter has two settings:
|directory||The location of the directory containing the CSV files.|
|maxStringLength||The length (number of characters, including whitespace) used for the VARCHAR columns. Make sure this is equal to or larger than the longest string in any of the columns.|
The first line of the CSV file needs to contain a header specifying the name and data type of every column. The column name and the data type need to be separated by a colon:
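As an illustration of the `name:type` convention, the sketch below parses a header line and rejects fields that lack a colon or use a type not listed under Supported Data Types. The function name `parse_header`, the sample column names, and the exact, case-sensitive matching of type names are assumptions made for this example, not documented adapter behavior.

```python
import csv

# Type names as listed in this document; matching them case-sensitively
# is an assumption of this sketch.
SUPPORTED_TYPES = {"boolean", "double", "float", "int", "long", "string"}


def parse_header(header_line):
    """Split a header line into (column name, data type) pairs."""
    fields = next(csv.reader([header_line]))
    columns = []
    for field in fields:
        # Split on the last colon so the type is always the final part.
        name, sep, data_type = field.rpartition(":")
        if not sep or not name:
            raise ValueError(f"header field {field!r} is not of the form name:type")
        if data_type not in SUPPORTED_TYPES:
            raise ValueError(f"unsupported data type {data_type!r} in field {field!r}")
        columns.append((name, data_type))
    return columns


# Hypothetical header with three columns.
print(parse_header("id:int,name:string,salary:double"))
```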
Supported Data Types
The CSV adapter supports the following data types:
|boolean||Allowed values: |
|double||Floating point number using a ‘.’ (dot) as decimal separator. Mapped to the data type DOUBLE (8 bytes, IEEE 754).|
|float||Floating point number using a ‘.’ (dot) as decimal separator. Mapped to the data type REAL (4 bytes, IEEE 754).|
|int||Integer number in the range |
|long||Integer number in the range |
|string||String (can contain letters, numbers, and special characters). Must be wrapped in double quotes if it contains commas. Mapped to a VARCHAR. Length can be specified using the adapter parameter 
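When generating CSV files programmatically, the quoting rule for strings and the value for the string length setting can both be derived from the data. The following sketch uses Python's standard `csv` module, which wraps a field in double quotes when it contains a comma. The helper names, column names, and sample rows are made up, and measuring length on the unquoted value is an assumption.

```python
import csv
import io


def write_rows(header, rows):
    """Serialize rows as CSV text, quoting only fields that need it."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def longest_string(rows, string_columns):
    """Return the length of the longest value in the given column indexes.

    Assumption: the length is counted on the unquoted value.
    """
    return max((len(row[i]) for row in rows for i in string_columns), default=0)


# Made-up rows; the first city contains a comma and will be quoted.
rows = [[1, "Basel, Switzerland"], [2, "Zurich"]]
text = write_rows(["id:int", "city:string"], rows)
print(text)
print("longest string:", longest_string(rows, [1]))
```

The value reported by `longest_string` is a lower bound for the string length setting; choosing a larger value leaves room for data that changes in the background.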
The adapter can either be deployed using the Polypheny-UI (Adapters -> Sources) or using the following SQL statement:
Please make sure to adjust csvFolder and the maxStringLength according to your needs.
After successful deployment, each CSV file is mapped to a table in the public schema. The tables and columns can be renamed. Furthermore, columns can be reordered, and columns that are not required can be dropped (they are not deleted from the file). If you change your mind, dropped columns can be added again using the Polypheny-UI or this special SQL statement:
physicalName refers to the name specified in the header of the CSV file.