How to read and parse CSV file in Java

摘要: A Comma-Separated Values (CSV) file is just a normal plain-text file, store data in column by column, and split it by a separator (e.g normally it is a comma “,”).

A Comma-Separated Values (CSV) file is just a normal plain-text file, store data in column by column, and split it by a separator (e.g normally it is a comma “,”).

For example :

1,US,United States
2,MY,Malaysia
3,AU,Australia

or

"1","US","United States"
"2","MY","Malaysia"
"3","AU","Australia"

Note
Read this RFC4180 document for Comma-Separated Values (CSV) format.

In a CSV file, normally there are two issues:

1. The field containing separator, for example, separator is a comma, and the field containing commas:

"aaa","b,bb","ccc"

2. The double-quotes are used to enclose fields, and the field containing double-quotes. To solve it, a double-quote appearing inside a field must be escaped by preceding it with another double quote (RFC4180)

"aaa","b""bb","ccc"

In this tutorial, we show you three examples to read, parse and print out the values from a CSV file.

  1. Simple solution to parse a simple formatted CSV file.

  2. Advance solution to parse a weird formatted CSV file (field containing separator or double-quotes)

  3. Third party solution, OpenCSV example.

1. Simple Solution

If you are sure the CSV files doesn’t contain “separator or double-quotes”, just use the standard split() to parse the CSV file.

1.1 Review a simple CSV file

/Users/mkyong/csv/country.csv

"1.0.0.0","1.0.0.255","16777216","16777471","AU","Australia"
"1.0.1.0","1.0.3.255","16777472","16778239","CN","China"
"1.0.4.0","1.0.7.255","16778240","16779263","AU","Australia"
"1.0.8.0","1.0.15.255","16779264","16781311","CN","China"
"1.0.16.0","1.0.31.255","16781312","16785407","JP","Japan"
"1.0.32.0","1.0.63.255","16785408","16793599","CN","China"
"1.0.64.0","1.0.127.255","16793600","16809983","JP","Japan"
"1.0.128.0","1.0.255.255","16809984","16842751","TH","Thailand"

1.2 No magic, just read above text file, and splits it by a comma separator.

CSVReader.csv

package com.mkyong.csv;
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
public class CSVReader {
    public static void main(String[] args) {
        String csvFile = "/Users/mkyong/csv/country.csv";
        BufferedReader br = null;
        String line = "";
        String cvsSplitBy = ",";
        try {
            br = new BufferedReader(new FileReader(csvFile));
            while ((line = br.readLine()) != null) {
                // use comma as separator
                String[] country = line.split(cvsSplitBy);
                System.out.println("Country [code= " + country[4] + " , name=" + country[5] + "]");
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (br != null) {
                try {
                    br.close();
                } catch (IOException e) {
                    e.printStackTrace();

1.3 For JDK 7 and above, uses try-resources.

CSVReader.csv

package com.mkyong.csv;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
public class CSVReader {
    public static void main(String[] args) {
        String csvFile = "/Users/mkyong/csv/country.csv";
        String line = "";
        String cvsSplitBy = ",";
        try (BufferedReader br = new BufferedReader(new FileReader(csvFile))) {
            while ((line = br.readLine()) != null) {
                // use comma as separator
                String[] country = line.split(cvsSplitBy);
                System.out.println("Country [code= " + country[4] + " , name=" + country[5] + "]");
        } catch (IOException e) {
            e.printStackTrace();

Output

Country [code= "AU" , name="Australia"]
Country [code= "CN" , name="China"]
Country [code= "AU" , name="Australia"]
Country [code= "CN" , name="China"]
Country [code= "JP" , name="Japan"]
Country [code= "CN" , name="China"]
Country [code= "JP" , name="Japan"]
Country [code= "TH" , name="Thailand"]

2. Advance Solution

This solution will solve the field containing “separator or double-quotes” issue, and also support the custom separator and custom enclosed field. Review the following CSV parsing example and also the JUnit test cases to understand how it works.

Note
Again, if a double-quote appearing inside a field must be escaped by preceding it with another double quote, for example

"aaa","b""bb","ccc"

2.1 Review another CSV file

/Users/mkyong/csv/country2.csv

10,AU,Australia
11,AU,Aus""tralia
"12","AU","Australia"
"13","AU","Aus""tralia"
"14","AU","Aus,tralia"

2.2 The below example is inspired by this article – “Only Class You Need for CSV Files” (with some fixed to support other features, read “fixed comments”) and also this third party OpenCSV library.

CSVUtils.java

package com.mkyong.utils;
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
public class CSVUtils {
    private static final char DEFAULT_SEPARATOR = ',';
    private static final char DEFAULT_QUOTE = '"';
    public static void main(String[] args) throws Exception {
        String csvFile = "/Users/mkyong/csv/country2.csv";
        Scanner scanner = new Scanner(new File(csvFile));
        while (scanner.hasNext()) {
            List<String> line = parseLine(scanner.nextLine());
            System.out.println("Country [id= " + line.get(0) + ", code= " + line.get(1) + " , name=" + line.get(2) + "]");
        scanner.close();
    public static List<String> parseLine(String cvsLine) {
        return parseLine(cvsLine, DEFAULT_SEPARATOR, DEFAULT_QUOTE);
    public static List<String> parseLine(String cvsLine, char separators) {
        return parseLine(cvsLine, separators, DEFAULT_QUOTE);
    public static List<String> parseLine(String cvsLine, char separators, char customQuote) {
        List<String> result = new ArrayList<>();
        //if empty, return!
        if (cvsLine == null && cvsLine.isEmpty()) {
            return result;
        if (customQuote == ' ') {
            customQuote = DEFAULT_QUOTE;
        if (separators == ' ') {
            separators = DEFAULT_SEPARATOR;
        StringBuffer curVal = new StringBuffer();
        boolean inQuotes = false;
        boolean startCollectChar = false;
        boolean doubleQuotesInColumn = false;
        char[] chars = cvsLine.toCharArray();
        for (char ch : chars) {
            if (inQuotes) {
                startCollectChar = true;
                if (ch == customQuote) {
                    inQuotes = false;
                    doubleQuotesInColumn = false;
                } else {
                    //Fixed : allow "" in custom quote enclosed
                    if (ch == '\"') {
                        if (!doubleQuotesInColumn) {
                            curVal.append(ch);
                            doubleQuotesInColumn = true;
                    } else {
                        curVal.append(ch);
            } else {
                if (ch == customQuote) {
                    inQuotes = true;
                    //Fixed : allow "" in empty quote enclosed
                    if (chars[0] != '"' && customQuote == '\"') {
                        curVal.append('"');
                    //double quotes in column will hit this!
                    if (startCollectChar) {
                        curVal.append('"');
                } else if (ch == separators) {
                    result.add(curVal.toString());
                    curVal = new StringBuffer();
                    startCollectChar = false;
                } else if (ch == '\r') {
                    //ignore LF characters
                    continue;
                } else if (ch == '\n') {
                    //the end, break!
                    break;
                } else {
                    curVal.append(ch);
        result.add(curVal.toString());
        return result;

Output

Country [id= 10, code= AU , name=Australia]
Country [id= 11, code= AU , name=Aus"tralia]
Country [id= 12, code= AU , name=Australia]
Country [id= 13, code= AU , name=Aus"tralia]
Country [id= 14, code= AU , name=Aus,tralia]

3.3 Review the following unit test, it test the “commas and double quotes” issue.

CSVUtilsTest.java

package com.mkyong.csv;
import com.mkyong.utils.CSVUtils;
import org.hamcrest.core.IsNull;
import org.junit.Test;
import java.util.List;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.core.Is.is;
public class CSVUtilsTest {
    @Test
    public void test_no_quote() {
        String line = "10,AU,Australia";
        List<String> result = CSVUtils.parseLine(line);
        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Australia"));
    @Test
    public void test_no_quote_but_double_quotes_in_column() throws Exception {
        String line = "10,AU,Aus\"\"tralia";
        List<String> result = CSVUtils.parseLine(line);
        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Aus\"tralia"));
    @Test
    public void test_double_quotes() {
        String line = "\"10\",\"AU\",\"Australia\"";
        List<String> result = CSVUtils.parseLine(line);
        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Australia"));
    @Test
    public void test_double_quotes_but_double_quotes_in_column() {
        String line = "\"10\",\"AU\",\"Aus\"\"tralia\"";
        List<String> result = CSVUtils.parseLine(line);
        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Aus\"tralia"));
    @Test
    public void test_double_quotes_but_comma_in_column() {
        String line = "\"10\",\"AU\",\"Aus,tralia\"";
        List<String> result = CSVUtils.parseLine(line);
        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Aus,tralia"));

3.4 Review another unit test, it test the custom separator and custom enclosed field.

CSVUtilsTestCustom.java

package com.mkyong.csv;
import com.mkyong.utils.CSVUtils;
import org.hamcrest.core.IsNull;
import org.junit.Test;
import java.util.List;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.core.Is.is;
public class CSVUtilsTestCustom {
    @Test
    public void test_custom_separator() {
        String line = "10|AU|Australia";
        List<String> result = CSVUtils.parseLine(line, '|');
        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Australia"));
    @Test
    public void test_custom_separator_and_quote() {
        String line = "'10'|'AU'|'Australia'";
        List<String> result = CSVUtils.parseLine(line, '|', '\'');
        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Australia"));
    @Test
    public void test_custom_separator_and_quote_but_custom_quote_in_column() {
        String line = "'10'|'AU'|'Aus|tralia'";
        List<String> result = CSVUtils.parseLine(line, '|', '\'');
        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Aus|tralia"));
    @Test
    public void test_custom_separator_and_quote_but_double_quotes_in_column() {
        String line = "'10'|'AU'|'Aus\"\"tralia'";
        List<String> result = CSVUtils.parseLine(line, '|', '\'');
        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Aus\"tralia"));

3. OpenCSV Example

If you are not comfortable with above simple and advance solution, try using third party CSV library – OpenCSV.

3.1 Maven.

pom.xml

	<dependency>
		<groupId>com.opencsv</groupId>
		<artifactId>opencsv</artifactId>
		<version>3.8</version>
	</dependency>

3.2 Review a CSV file.

/Users/mkyong/csv/country3.csv

10,AU,Australia
11,AU,Aus""tralia
"12","AU","Australia"
"13","AU","Aus""tralia"
"14","AU","Aus,tralia"

3.2 OpenCSV example to parse above CSV file.

CSVReaderExample.java

package com.mkyong.csv;
import com.opencsv.CSVReader;
import java.io.FileReader;
import java.io.IOException;
public class CSVReaderExample {
    public static void main(String[] args) {
        String csvFile = "/Users/mkyong/csv/country3.csv";
        CSVReader reader = null;
        try {
            reader = new CSVReader(new FileReader(csvFile));
            String[] line;
            while ((line = reader.readNext()) != null) {
                System.out.println("Country [id= " + line[0] + ", code= " + line[1] + " , name=" + line[2] + "]");
        } catch (IOException e) {
            e.printStackTrace();

Output

Country [id= 10, code= AU , name=Australia]
Country [id= 11, code= AU , name=Aus"tralia]
Country [id= 12, code= AU , name=Australia]
Country [id= 13, code= AU , name=Aus"tralia]
Country [id= 14, code= AU , name=Aus,tralia]

Note
Please refer to this OpenCSV official documentation for more examples.

Done.

References

  1. The Only Class You Need for CSV Files

  2. CSVHelper Example

  3. Ostermiller Java Utilities – Comma Separated Values (CSV)

  4. RFC4180 – Format for Comma-Separated Values (CSV)

  5. OpenCSV website

  6. Java – How to export data to CSV file

上一篇: Spring asm dependency issue in Spring Data
下一篇: Display a list of countries in Java
 评论 ( What Do You Think )
名称
邮箱
网址
评论
验证
   
 

 


  • 微信公众号

  • 我的微信

站点声明:

1、一号门博客CMS,由Python, MySQL, Nginx, Wsgi 强力驱动

2、部分文章或者资源来源于互联网, 有时候很难判断是否侵权, 若有侵权, 请联系邮箱:summer@yihaomen.com, 同时欢迎大家注册用户,主动发布无版权争议的 文章/资源.

3、鄂ICP备14001754号-3, 鄂公网安备 42280202422812号