# Trailing line breaks

Trailing line breaks are usually a minor problem, but it is useful to be aware of them. Some CSV files have a few additional blank lines at the end. This can confuse some CSV parsers into thinking that they are dealing with actual records where all the fields are empty. Importing a file like that can lead to records with many missing values and distort summary statistics.

```
'ID','FIRST_NAME'\r\n
1,'Adam'\r\n
2,'Billy'\r\n
3,'Elizabeth'\r\n
\r\n
\r\n
```

A bad parser could read the above CSV and interpret the content as

| `ID`   | `FIRST_NAME` |
| ------ | ------------ |
| 1      | Adam         |
| 2      | Billy        |
| 3      | Elizabeth    |
| `NULL` | `NULL`       |
| `NULL` | `NULL`       |

To avoid this, check the last few lines of the CSV file manually, for example with the command line tool `tail` on Linux or by looking at the CSV file in Notepad++, if it isn't too big. You can also check for NULL values in a column that should absolutely not be NULL. If you find such a record and all the other columns are NULL too, that is an indicator for trailing line breaks as well.

You can also read the file in as a plain text file and calculate the number of characters in every line. If there are N columns, there should be at least N-1 column separators, i.e. `,` or `;` characters. Any column with less characters is obviously an error and should be inspected and possibly excluded.

Takeaways:

* Inspect the last few lines of a CSV file using the tail command or Notepad++
* Check for rows where all values are NULL Count the number of column separators in each row. With N columns, there should be at least N-1 occurrences of the column separator


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://b-greve.gitbook.io/beginners-guide-to-clean-data/common-csv-problems/trailing-line-breaks.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
