Python CSV

CSV, or Comma-Separated Values, is a simple file format used to store tabular data, such as spreadsheets or databases. It's widely used for data exchange between applications due to its simplicity and readability.

Key Features of CSV

Structure:
CSV files consist of rows and columns, where each row represents a record and each column represents a field.
Values within a row are separated by commas (or other delimiters like semicolons or tabs).

File Extension:
CSV files typically have a .csv extension.

Plain Text:
CSV files are plain text files, making them easy to create, edit, and read with any text editor.

Flexibility:
Different applications can use different delimiters (e.g., commas, tabs), but commas are the most common.

No Data Types:
CSV files do not inherently support data types (like integers, dates, etc.). All values are treated as strings.

Example of a CSV File:
Name, Age, City
Alice, 30, New York
Bob, 25, Los Angeles
Charlie, 35, Chicago

Use Cases

Data Import/Export: Many applications (like Excel, Google Sheets, and databases) can import and export data in CSV format.

Data Analysis: CSV is commonly used in data analysis for easy data manipulation and processing.

Interoperability: Because of its simplicity, CSV is a great way to share data between different software systems.

Limitations:
Lack of Standardization: Different applications may handle special characters, delimiters, and line endings differently, leading to compatibility issues.

No Support for Complex Data: CSV is not suitable for hierarchical or complex data structures, as it only supports flat tables.

The csv module in Python allows applications to work with CSV files by reading from and writing to CSV data in a simple and efficient manner.

1. Importing the csv Module

Start by importing the csv module to enable CSV file operations.
import csv

2. Writing Data to a CSV File

Use csv.writer() to write rows to a CSV file. Each row can be written as a list of values.
# Data to write to CSV
data = [
    ["Name", "Age", "City"],
    ["Alice", 25, "New York"],
    ["Bob", 30, "Los Angeles"],
    ["Charlie", 35, "Chicago"]
]

# Open a CSV file in write mode
with open("output.csv", mode="w", newline="") as file:
    writer = csv.writer(file)
    writer.writerows(data)
print("Data written to CSV successfully.")

Output:

Data written to CSV successfully.
Explanation: csv.writer() writes each row in data to output.csv. The writerows() method writes multiple rows at once.

3. Reading Data from a CSV File

To read data from a CSV file, use csv.reader(). Each row will be read as a list.
# Open a CSV file in read mode
with open("output.csv", mode="r") as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)

Output:

['Name', 'Age', 'City']
['Alice', '25', 'New York']
['Bob', '30', 'Los Angeles']
['Charlie', '35', 'Chicago']
Explanation: csv.reader() reads each line of the CSV file and returns each row as a list. This makes it easy to process each row individually.

4. Writing Data to a CSV File with Custom Delimiter

To use a custom delimiter (e.g., semicolon), specify the delimiter argument in csv.writer().
# Open a CSV file with semicolon delimiter
with open("output_semicolon.csv", mode="w", newline="") as file:
    writer = csv.writer(file, delimiter=";")
    writer.writerows(data)
print("Data written to CSV with semicolon delimiter successfully.")

Output:

Data written to CSV with semicolon delimiter successfully.
Explanation: Specifying delimiter=";" changes the delimiter to a semicolon, which is often used in regions where commas are used as decimal points.

5. Reading Data with Custom Delimiter

To read data with a custom delimiter, specify the delimiter argument in csv.reader().
# Read CSV with semicolon delimiter
with open("output_semicolon.csv", mode="r") as file:
    reader = csv.reader(file, delimiter=";")
    for row in reader:
        print(row)

Output:

['Name', 'Age', 'City']
['Alice', '25', 'New York']
['Bob', '30', 'Los Angeles']
['Charlie', '35', 'Chicago']
Explanation: Specifying delimiter=";" allows reading semicolon-separated values, making the function adaptable to different CSV formats.

6. Writing Data with a DictWriter

Use csv.DictWriter to write data as dictionaries, where each row is a dictionary with keys corresponding to column headers.
# Data to write as dictionary
data_dict = [
    {"Name": "Alice", "Age": 25, "City": "New York"},
    {"Name": "Bob", "Age": 30, "City": "Los Angeles"},
    {"Name": "Charlie", "Age": 35, "City": "Chicago"}
]

# Open CSV file in write mode
with open("output_dict.csv", mode="w", newline="") as file:
    # Specify fieldnames
    fieldnames = ["Name", "Age", "City"]
    writer = csv.DictWriter(file, fieldnames=fieldnames)

    # Write headers
    writer.writeheader()
    # Write data rows
    writer.writerows(data_dict)
print("Data written to CSV using DictWriter successfully.")

Output:

Data written to CSV using DictWriter successfully.
Explanation: DictWriter is useful when data is stored as dictionaries, allowing each row to be written using dictionary keys.

7. Reading Data with a DictReader

Use csv.DictReader to read CSV data where each row is converted to a dictionary with column headers as keys.
# Read CSV file with DictReader
with open("output_dict.csv", mode="r") as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row)

Output:

{'Name': 'Alice', 'Age': '25', 'City': 'New York'}
{'Name': 'Bob', 'Age': '30', 'City': 'Los Angeles'}
{'Name': 'Charlie', 'Age': '35', 'City': 'Chicago'}
Explanation: DictReader reads each row as a dictionary, with keys derived from the header row. This is useful for named access to each value.

8. Handling Quotes and Special Characters

By default, csv will wrap values containing special characters (like commas) in double quotes. Use the quotechar argument to specify the quote character.
# Data with a comma in the text
data_special = [
    ["Name", "Age", "City"],
    ["Alice", 25, "New York, NY"],
    ["Bob", 30, "Los Angeles, CA"]
]

# Write data with special characters
with open("output_quotes.csv", mode="w", newline="") as file:
    writer = csv.writer(file, quotechar='"', quoting=csv.QUOTE_MINIMAL)
    writer.writerows(data_special)
print("Data with special characters written successfully.")

Output:

Data with special characters written successfully.
Explanation: Setting quotechar to " wraps values with special characters in quotes. csv.QUOTE_MINIMAL wraps only when necessary.

9. Handling Line Terminators

Set the lineterminator argument to specify the line ending character. This can be useful for compatibility with different platforms.
# Write data with custom line terminator
with open("output_terminator.csv", mode="w", newline="") as file:
    writer = csv.writer(file, lineterminator="\r\n")
    writer.writerows(data)
print("Data written with custom line terminator successfully.")

Output:

Data written with custom line terminator successfully.
Explanation: Setting lineterminator="\r\n" is useful for writing files compatible with Windows.

10. Summary

The Python csv module provides efficient functionality for reading from and writing to CSV files, supporting different delimiters, quote characters, and formats through csv.reader, csv.writer, DictReader, and DictWriter. These options enable handling of various CSV formats for effective data management.

Previous: Python Sqlite3 | Next: Python subprocess

<
>