Mobile Menu

SAP HANA :: Loading Data from Flat Files via SAP HANA Studio

Let’s talk about data loading into HANA DB tables. There are different ways to do it, but this time let’s look into uploading data from flat files via HANA Studio and deep into some nuances of this process.

The whole staff depends on what we have. So:

I. Target table is created

We have our table created, and we have a simple CSV file. And we have the simplest way to manage our task with the help of “Data From Local File” import option. Nice to mention that it allows loading data from xls, xlsx and csv files.

1. Go to menu File -> Import and choose SAP HANA Content -> Data From Local File source. Click Next.

Note: if you have no “Data From Local File”, check your HANA version, it should be at least SPS04 (1.00.41.XXXXXX).

2. Select the target system where you want to load data. Click Next

3. Define import parameters:

  • Source file: browse the file to upload
  • File Details: Choose the Field Delimiter (Comma / Semi Colon / Colon) and select Worksheet (if the file has several sheets, otherwise this option would be disabled); Header row exists: check out if the flat file has header row and set the header row number. Otherwise uncheck this checkbox; Import all data: either import all rows or part of them (specify from – to lines’ numbers).
  • Target Table: if the table exists, choose Existing option and browse it. Otherwise go to the next situation guide for details.

4. Click next. Here is the mapping of source file columns with target table columns. If source file doesn’t contain header, you see COLUMN_N in proposal structure, just map the corresponding columns from left side to the right by dropping proper columns. At the bottom you find data preview. Check everything is correct and click Next.

5. In Summary screen is shown how it will look like. If the target table already contains data it will be shown in Data from Existing Table. Click Finish.

Note: new data will be appended to the end of the table.

6. If load succeeded, you’ll see ‘Completed successfully’ message in Job Log:

Note: if the status is not so positive, check your source file, the delimiter, types of columns, if you have DATE type there, be sure it has correct format.

7. Now it’s time to enjoy our table full of data! Refresh the schema and find the table in Tables folder, click on it with right-button-mouse and select

Open Data Preview or Open Content and you’ll see the result:

II. Target table is not created

Let’s complicate a bit our task. Now we have no table, but luckily our file is still simple. The only requisite here is that the file should contains header.

1. Repeat steps 1 – 2 from previous guide. Now we are asked to define Import Properties. Fill it as before, except Target Table section – here we select New Schema and Table name:

2. Click Next

3. As the target table is not exist yet, check the proposed definition and do not forget to check out the Key field. Also be careful with text fields’ length – initial it is set to the highest within the source file, but later you may need it longer.

To change the order of the fields (Up or Down) or insert new fields or delete the existing fields use the top right corner icons.

Choose Store Type (Column or Row).

4. Click Next. You’ll see Summary screen with data preview. Click Finish.

5. If everything is Ok, you’ll see ‘Completed successfully’ message in Job Log:

III. Loading data in multiple tables

And what if we have several tables, e.g. 20? It will be too boring to fill all of them in such a way. Let’s try another possibility to load data from flat file.

Prerequisites:

  • prepare some different CSV files – with and without header line, with different delimiter (“;”, ”,”);
  • be sure you know the path to your flat files (place them into the proper directory, e.g. data package of your project);
  • the target tables must exist.

Note: this project was made in HANA Studio as Eclipse plug-in.

1. Create new file in your project: right-mouse click at data package, select New -> File. Enter file name with .hdbti extension. Click Finish.

2. Place there the code below (replace values in [] with proper one, without []). Header = true/false means whether the flat file contains header line.

Set the delimiter in delimField. By default it is “,” this parameter is optional.

import = [
{
    table = "[your_path].data::[first_table_name]";
    schema = "[schema_name]";
    file = "[your_path].data:[first_flat_file].csv";
    header = true;
    delimField = ";";
},
…
{
    table = "[your_path].data::[n_table_name]";
    schema = "[schema_name]";
    file = "[your_path].data:[n_flat_file].csv";
    header = false;
    delimField = ",";
}]

3. Important: this method will overwrite all previous data in the table!!!

Commit and Activate all the files involved (flat files, .hdbti file). As it is done, you may check if your data is loaded. Again, refresh schema and find the tables in Table folder.

Note: if during activation you get an error see the log messages, check flat files and the syntax of the Table-Import Configuration-File (.hdbti file).

IV. Loading Large files

Finally, we got a SCV file that is about 4 Mb. It can be quite a slow process to load such a big file, so let’s sort out with a third flat file uploading method. Now we will be using a CTL file (control file).

1. Content of the control file can be:

IMPORT DATA INTO TABLE "[schema_name]"."[table_name]" FROM '/[path_to_csv]/data.csv'
RECORD DELIMITED BY '\n'
FIELD DELIMITED BY ',';

or

IMPORT FROM CSV FILE '/[path_to_csv]/data.csv' INTO "[table_name]"
WITH
RECORD DELIMITED BY '\n'
FIELD DELIMITED BY ',';

Tips:

  • Use BATCH clause for better performance while importing large number of rows.
  • Include an ERROR LOG file to ensure no errors are generated. Note, it is read only, so it won’t be refreshed next time.
  • Use FAIL ON INVALID DATA to make load process fail as it encounters first error.
  • Use DATE FORMAT to specify the date format you need (by default YYYY-MM-DD is expected).
  • Check no special symbols are in the flat file.
IMPORT DATA INTO TABLE "[table_name]" FROM '/[path_to_csv]/data.csv'
RECORD DELIMITED BY '\n'
FIELD DELIMITED BY ','
BATCH 1000
ERROR LOG '/[path_to_log]/error.log'
FAIL ON INVALID DATA
DATE FORMAT 'MM/DD/YYYY';

2. Run an SQL command

IMPORT FROM '[ctl_file].ctl';
UPDATE [SCHEMA_NAME].[table_name] MERGE DELTA INDEX;

Note: for files larger than 4 Mb the loading data using importing CSV files into HANA Studio is not the best way.

For bulk load use another methods:

  • A linked table in Microsoft Access (using HANA ODBC driver);
  • A web service hosted in HANA in combination with SAPUI5 File Upload control.