
Loading a CSV file into a database can be a cumbersome task if your database provider does not offer an out-of-the-box feature for it. Most of the time you end up writing valid insert statements by hand and escaping every special character in the values. Importing gets more complicated once the file has description fields that can contain punctuation such as commas or single and double quotation marks.
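To see why punctuation inside a field is a problem, here is a minimal sketch that compares a naive split on commas with a small quote-aware parser. The class QuotedCsvSketch, its parseLine method and the sample row are made up for illustration; the loader in this article relies on OpenCSV for this job, and this sketch does not handle quoted fields that span several lines.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, for illustration only; the article itself relies on OpenCSV. */
class QuotedCsvSketch {

    /** Splits one CSV line, honoring double-quoted fields and "" as an escaped quote. */
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote inside a quoted field
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas are plain text while quoted
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        // Made-up row: the second and third fields contain a comma and quotes.
        String line = "4,\"Singer, Bobby\",\"Says \"\"idjit\"\" a lot\"";
        System.out.println(line.split(",").length);  // prints 4: the quoted comma is split too
        System.out.println(parseLine(line).size());  // prints 3
    }
}
```

A plain String.split breaks the quoted name into two pieces, which is exactly the kind of mistake a CSV library saves you from.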

So here is a simple Java utility class that can be used to load a CSV file into a database. It follows a few good practices for loading data. The CSV file is parsed line by line and a single SQL insert query is created. The values are bound to the query and the query is added to a JDBC batch. Each batch is executed when a limit is reached (in this case 1000 queries per batch).
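The batching idea on its own can be sketched roughly as follows. The class BatchInsertSketch and the method insertInBatches are hypothetical names used only for this illustration, and every value is bound as a string here to keep it short; the full loader further down also handles dates.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Hypothetical helper, for illustration only; not part of the CSVLoader class. */
class BatchInsertSketch {

    /**
     * Binds every row to the prepared statement and sends a batch to the
     * database each time batchSize rows are pending.
     * Returns the number of batches that were executed.
     */
    static int insertInBatches(PreparedStatement ps, List<String[]> rows, int batchSize)
            throws SQLException {
        int pending = 0;
        int flushes = 0;
        for (String[] row : rows) {
            for (int i = 0; i < row.length; i++) {
                ps.setString(i + 1, row[i]); // JDBC parameter indexes start at 1
            }
            ps.addBatch();
            if (++pending == batchSize) {
                ps.executeBatch();           // limit reached, send this batch
                pending = 0;
                flushes++;
            }
        }
        if (pending > 0) {
            ps.executeBatch();               // send the last, partially filled batch
            flushes++;
        }
        return flushes;
    }
}
```

With 2500 rows and a batch size of 1000 this makes three round trips to the database instead of 2500.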

Import CSV into Database example

Let’s look at an example. Below is the sample CSV file that I want to upload into the database table Customer.

employee.csv – Sample CSV file:

EMPLOYEE_ID,FIRSTNAME,LASTNAME,BIRTHDATE,SALARY
1,Dean,Winchester,27.03.1975,60000
2,John,Winchester,01.05.1960,120000
3,Sam,Winchester,04.01.1980,56000

The Customer table contains a few fields. We added fields of different types (VARCHAR2, DATE, NUMBER) to check that our load method works properly.

Table: Customer – Database table

CREATE TABLE Customer (
  EMPLOYEE_ID  NUMBER,
  FIRSTNAME    VARCHAR2(50 BYTE),
  LASTNAME     VARCHAR2(50 BYTE),
  BIRTHDATE    DATE,
  SALARY       NUMBER
)

Following is a sample Java class that uses the CSVLoader utility class (we will come to this shortly).

Main.java – Load employee.csv to database

package net.viralpatel.java;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
public class Main {
    private static String JDBC_CONNECTION_URL =
            "jdbc:oracle:thin:SCOTT/TIGER@localhost:1500:MyDB";
    
    public static void main(String[] args) {
        try {
            CSVLoader loader = new CSVLoader(getCon());
            
            loader.loadCSV("C:\\employee.csv", "CUSTOMER", true);
            
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    private static Connection getCon() {
        Connection connection = null;
        try {
            Class.forName("oracle.jdbc.driver.OracleDriver");
            connection = DriverManager.getConnection(JDBC_CONNECTION_URL);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        } catch (SQLException e) {
            e.printStackTrace();
        }
        return connection;
    }
}

In the Main class above, we created a CSVLoader object using its parameterized constructor and passed it a java.sql.Connection object.

Then we called the loadCSV method with three arguments: first the path of the CSV file, second the name of the table the data is loaded into, and third a boolean that decides whether the existing rows are deleted from the table before the new records are inserted.

Execute this Java class and you’ll see the records getting inserted into the table.


The CSV is successfully loaded into the database.

Let’s check the utility class now. I strongly recommend going through the tutorials below, as the utility class combines ideas from them.

  1. Batch Insert In Java – JDBC
  2. Read / Write CSV file in Java
  3. Check if String is valid Date in Java

The utility class uses the OpenCSV library to load and parse the CSV file. It then uses JDBC batching to group the insert queries and execute them. Each CSV value is checked to see whether it is a valid date before it is inserted.
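The date check itself lives in a DateUtil helper that is not listed in this post. As a rough idea of what such a check can look like, here is a minimal sketch. The class name DateCheckSketch and the single pattern dd.MM.yyyy are assumptions chosen to match the BIRTHDATE column of employee.csv; the real helper from the date tutorial may support other formats.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;

/** Hypothetical stand-in for the DateUtil helper used by CSVLoader. */
class DateCheckSketch {

    // Assumed pattern, chosen to match values such as 27.03.1975.
    private static final String PATTERN = "dd.MM.yyyy";

    /** Returns the parsed date, or null when the value is not a date in PATTERN. */
    static Date convertToDate(String value) {
        if (value == null) {
            return null;
        }
        String text = value.trim();
        SimpleDateFormat format = new SimpleDateFormat(PATTERN);
        format.setLenient(false);            // reject impossible dates such as 31.02.1980
        ParsePosition position = new ParsePosition(0);
        Date parsed = format.parse(text, position);
        // Accept the value only if parsing succeeded and consumed the whole string.
        if (parsed == null || position.getIndex() != text.length()) {
            return null;
        }
        return parsed;
    }

    public static void main(String[] args) {
        System.out.println(convertToDate("27.03.1975") != null); // prints true
        System.out.println(convertToDate("Winchester") != null); // prints false
    }
}
```

Returning null for anything that is not a date is what lets the loader fall back to setString for names and numbers.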

CSVLoader.java – Utility class to load CSV into Database

package net.viralpatel.java;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.Date;
import org.apache.commons.lang.StringUtils;
import au.com.bytecode.opencsv.CSVReader;
/**
 *
 * @author viralpatel.net
 *
 */
public class CSVLoader {
    private static final
        String SQL_INSERT = "INSERT INTO ${table}(${keys}) VALUES(${values})";
    private static final String TABLE_REGEX = "\\$\\{table\\}";
    private static final String KEYS_REGEX = "\\$\\{keys\\}";
    private static final String VALUES_REGEX = "\\$\\{values\\}";
    private Connection connection;
    private char seprator;
    /**
     * Public constructor to build CSVLoader object with
     * Connection details. Note that loadCSV closes this
     * connection when it finishes, on success or failure.
     * @param connection open JDBC connection used for the load
     */
    public CSVLoader(Connection connection) {
        this.connection = connection;
        //Set default separator
        this.seprator = ',';
    }
    
    /**
     * Parse CSV file using OpenCSV library and load in
     * given database table.
     * @param csvFile Input CSV file
     * @param tableName Database table name to import data
     * @param truncateBeforeLoad Truncate the table before inserting
     *          new records.
     * @throws Exception
     */
    public void loadCSV(String csvFile, String tableName,
            boolean truncateBeforeLoad) throws Exception {
        CSVReader csvReader = null;
        if(null == this.connection) {
            throw new Exception("Not a valid connection.");
        }
        try {
            
            csvReader = new CSVReader(new FileReader(csvFile), this.seprator);
        } catch (Exception e) {
            e.printStackTrace();
            throw new Exception("Error occurred while opening file. "
                    + e.getMessage());
        }
        String[] headerRow = csvReader.readNext();
        if (null == headerRow) {
            throw new FileNotFoundException(
                    "No columns defined in given CSV file. " +
                    "Please check the CSV file format.");
        }
        // Build "?,?,...,?" with one placeholder per column, then drop the trailing comma
        String questionmarks = StringUtils.repeat("?,", headerRow.length);
        questionmarks = questionmarks.substring(0, questionmarks.length() - 1);
        String query = SQL_INSERT.replaceFirst(TABLE_REGEX, tableName);
        query = query
                .replaceFirst(KEYS_REGEX, StringUtils.join(headerRow, ","));
        query = query.replaceFirst(VALUES_REGEX, questionmarks);
        System.out.println("Query: " + query);
        String[] nextLine;
        Connection con = null;
        PreparedStatement ps = null;
        try {
            con = this.connection;
            con.setAutoCommit(false);
            ps = con.prepareStatement(query);
            if(truncateBeforeLoad) {
                //delete data from table before loading csv
                try (java.sql.Statement delete = con.createStatement()) {
                    delete.execute("DELETE FROM " + tableName);
                }
            }
            final int batchSize = 1000;
            int count = 0;
            Date date = null;
            while ((nextLine = csvReader.readNext()) != null) {
                int index = 1;
                for (String string : nextLine) {
                    // bind as DATE when the value parses as a date, else as a string
                    date = DateUtil.convertToDate(string);
                    if (null != date) {
                        ps.setDate(index++, new java.sql.Date(date.getTime()));
                    } else {
                        ps.setString(index++, string);
                    }
                }
                ps.addBatch();
                if (++count % batchSize == 0) {
                    ps.executeBatch();
                }
            }
            ps.executeBatch(); // insert remaining records
            con.commit();
        } catch (Exception e) {
            con.rollback();
            e.printStackTrace();
            throw new Exception(
                    "Error occurred while loading data from file to database. "
                            + e.getMessage());
        } finally {
            if (null != ps)
                ps.close();
            if (null != con)
                con.close();
            csvReader.close();
        }
    }
    public char getSeparator() {
        return seprator;
    }
    public void setSeparator(char separator) {
        this.seprator = separator;
    }
}

The class looks complicated but it is simple :)

The loadCSV method combines the ideas from the three tutorials above and creates the insert queries.
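If the template-and-regex part of loadCSV is hard to follow, the same query can be built more directly. The sketch below is an alternative written for illustration, not the code used in CSVLoader; the names InsertQuerySketch and buildInsert are hypothetical, and it assumes Java 8 or later for String.join.

```java
import java.util.Collections;

/** Hypothetical helper, for illustration only; CSVLoader builds its query with a template. */
class InsertQuerySketch {

    /** Builds an INSERT statement with one ? placeholder per CSV header column. */
    static String buildInsert(String tableName, String[] headerRow) {
        String columns = String.join(",", headerRow);
        String placeholders = String.join(",", Collections.nCopies(headerRow.length, "?"));
        // Table and column names are concatenated, not bound, so they must come
        // from a trusted source. Only the row values go through the placeholders.
        return "INSERT INTO " + tableName + "(" + columns + ") VALUES(" + placeholders + ")";
    }

    public static void main(String[] args) {
        String[] header = { "EMPLOYEE_ID", "FIRSTNAME", "LASTNAME", "BIRTHDATE", "SALARY" };
        // prints INSERT INTO CUSTOMER(EMPLOYEE_ID,FIRSTNAME,LASTNAME,BIRTHDATE,SALARY) VALUES(?,?,?,?,?)
        System.out.println(buildInsert("CUSTOMER", header));
    }
}
```

The header row of the CSV file decides the column list, which is why the column names in the file have to match the table.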

Following is the usage of this class if you want to use it in your project:

Usage

CSVLoader loader = new CSVLoader(connection);
loader.loadCSV("C:\\employee.csv", "TABLE_NAME", true);

Load a file with semicolon as delimiter:

CSVLoader loader = new CSVLoader(connection);
loader.setSeparator(';');
loader.loadCSV("C:\\employee.csv", "TABLE_NAME", true);

Load a file without emptying the table first:

CSVLoader loader = new CSVLoader(connection);
loader.loadCSV("C:\\employee.csv", "TABLE_NAME", false);

Hope this helps.

Download Source Code

Load_CSV_Database_Java_example.zip (2.05 MB)
